A NOTE ON SIR CYRIL BURT'S
‘FACTORIAL ANALYSIS OF QUALITATIVE DATA’


By LOUIS GUTTMAN
Scientific Director, Israel Institute of Applied Social Research, Jerusalem


I. Procedures for Dealing with Qualitative Data. II. Analysis into Principal
Components. III. Applications to Scalable and Non-Scalable Data. IV. The Study
of Deviations: Image Analysis and Nodal Analysis.


I. PROCEDURES FOR DEALING WITH 
QUALITATIVE DATA 


World War II interrupted communication between scientists in different countries, 
and only gradually is the exchange of information being made up. A case in point 
is the monograph The Prediction of Personal Adjustment by Paul Horst and others, 
which appeared just before the United States actively entered the holocaust. This 
was published by the Social Science Research Council in New York as Bulletin 48 
in 1941. During the war, only a handful of copies was able to reach Europe. In
a recent article in this Journal (1), Sir Cyril Burt has called attention to the importance 
of developing a theory for qualitative data, as distinct from quantitative data. He 
develops a particular algebraic formulation which leads to the resolution of the data 
into principal components (latent vectors). This happens to be a topic treated also 
in the above-mentioned monograph, by the present writer. It is gratifying to see 
how Professor Burt has independently arrived at much the same formulation. This 
convergence of thinking lends credence to the suitability of the approach. The
purpose of the present note is to call attention to the points of similarity and to 
describe further developments which have occurred in the United States and in 
Israel in the theory of qualitative data. 


In a chapter of the above-mentioned Social Science Research Council monograph
entitled ‘The Quantification of a Class of Attributes’ (2), it was proposed that qualitative
data could be recorded in a manner amenable to treatment by matrix algebra.
In form, the matrix M on page 326 of the monograph is identical with Table I of
Professor Burt's article ((1), p. 171). My own paper proposes three different kinds of
problems of quantification:


(a) Deriving quantitative scores for the respondents, to maximize a certain correlation
ratio (for the rows of the matrix).
(b) Deriving quantitative weights for the categories of the qualitative items, to maximize
another correlation ratio (for the columns of the matrix).
(c) Deriving scores and weights simultaneously, to maximize a certain correlation
coefficient (for rows and columns of the matrix simultaneously).


It was proved that all three problems are essentially equivalent. The maximizing
scores and weights for the third problem are also the maximizing ones for the
first and second problems respectively. Furthermore, the maximum correlation
ratios and coefficient are always mutually equal for the same data.
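
The equality of the three maxima can be checked numerically on a small example. The sketch below is not the derivation given in the monograph: it assumes the now-standard singular value formulation, in which the 0/1 record matrix is divided by the square roots of its row and column totals and the first non-trivial singular triple supplies the scores, the weights, and the maximum correlation. The function names and the five-respondent data set are hypothetical.

```python
import numpy as np


def quantify(indicator):
    """Return (scores, weights, rho) from the first non-trivial singular triple."""
    M = np.asarray(indicator, dtype=float)
    r = M.sum(axis=1)            # number of categories checked by each respondent
    c = M.sum(axis=0)            # number of respondents checking each category
    Z = M / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Index 0 is the trivial solution (singular value 1, constant scores).
    scores = U[:, 1] / np.sqrt(r)
    weights = Vt[1] / np.sqrt(c)
    return scores, weights, s[1]


def correlation_ratio(values, groups):
    """eta = sqrt(between-group sum of squares / total sum of squares)."""
    counts = np.bincount(groups)
    means = np.bincount(groups, weights=values) / counts
    grand = values.mean()
    between = np.sum(counts * (means - grand) ** 2)
    total = np.sum((values - grand) ** 2)
    return np.sqrt(between / total)


# Hypothetical data: 5 respondents, 3 dichotomous items; columns = (yes, no) per item.
answers = np.array([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0], [0, 1, 0]])
M = np.column_stack([col for k in range(3) for col in (answers[:, k], 1 - answers[:, k])])

scores, weights, rho = quantify(M)
i, j = np.nonzero(M)                                  # one (respondent, category) pair per check mark
eta_scores = correlation_ratio(scores[i], j)          # scores grouped by category
eta_weights = correlation_ratio(weights[j], i)        # weights grouped by respondent
r_cells = np.corrcoef(scores[i], weights[j])[0, 1]    # scores against weights over the check marks
print(rho, eta_scores, eta_weights, r_cells)
```

Under these assumptions the four printed figures coincide up to rounding error, which is the sense in which the two correlation ratios and the correlation coefficient are mutually equal.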


II. ANALYSIS INTO PRINCIPAL COMPONENTS


As usual in analysing internal consistency, the equations sought lead not only 
to a maximum, but also to a series of other stationary values down to a minimum. 
These various solutions provide a principal component system. There is a set of
principal component scores for the respondents, and a set of principal component 
weights for the categories of the items. Professor Burt has arrived at the latter 
solution, namely, that for principal component weights. 


My own article goes on to point out that, while the principal components here 
are formally similar to those for quantitative variables, nevertheless their interpreta- 
tion may be quite different. The interrelations among qualitative items are not 
linear, nor even algebraic, in general. Similarly, the relation of a qualitative item to 
a quantitative variable is in general non-algebraic. Since the purpose of principal
components—or any other method of factor analysis—is to help reproduce the 
original data, one must take into account this peculiar feature. 


The first principal component can possibly fully reproduce all the qualitative
items entirely by itself: the items may be perfect, albeit non-algebraic, functions of this
component. Linear prediction will not be perfect in this case, but this is not the
best prediction technique possible for such data. Therefore, if the first principal
component only accounts for a small proportion of the total variance of the data
in the ordinary sense, it must be remembered that this ordinary sense implies linear
prediction. If the correct, but non-linear, prediction technique is used, the whole
variation can sometimes be accounted for by but the single component. In such
a case, the existence of more than one principal component arises merely from the
fact that a linear system is being used to approximate a non-linear one. (Each item is
always a perfect linear function of all the principal components taken simultaneously.)


III. APPLICATIONS TO SCALABLE AND
NON-SCALABLE DATA 


This thinking was carried further with the definition of a perfect scale (3). In
a perfect scale, each item is a perfect function of a single rank order of respondents. 
The functional relation is not algebraic, although of an extremely simple nature. 
Applying the general equations of principal components developed in (2) to the case 
of the perfect scale defined in (3) leads to some most extraordinary results. There is 
a definite law of formation for the entire set of principal components for scores, and a 
corresponding law of formation for principal component weights. Indeed, the 
matrix equation for the principal components in each case transforms into a 
difference equation. For dichotomous items, this is a second order linear difference 
equation which, in the limit, for an infinite number of items, yields a second order 
differential equation that is classical in mathematics and physics. The principal 
components turned out to be sets of orthogonal functions, like the Legendre poly- 
nomials, Hermite polynomials, trigonometric functions, etc. 


The second principal component in all cases is a perfect U- or J-shaped function
of the first. The third principal component is also a perfect function of the first,
but its graph shows a curve with two bends (for example, a cubic parabola). The
linear correlation between any two principal components is always zero, because
they are orthogonal to each other (and have zero means); but the curvilinear correlation
of each with the first component is perfect. In this sense, the further principal
components beyond the first add no new information.
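
The behaviour just described can be illustrated with a small numerical sketch. It rests on assumptions not stated in the text: four dichotomous items, exactly one respondent for each of the five scale types, the (yes, no) coding of each item, and the singular value formulation of the principal component equations. All function names are hypothetical.

```python
import numpy as np


def perfect_scale_indicator(n_items):
    """Rows: the n_items + 1 scale types, lowest first. Columns: (yes, no) per item."""
    types = np.arange(n_items + 1)
    columns = []
    for k in range(1, n_items + 1):
        passed = (types >= k).astype(float)   # item k is passed by types k and above
        columns.extend([passed, 1.0 - passed])
    return np.column_stack(columns)


def component_scores(M, how_many):
    """Scores of the first `how_many` non-trivial principal components."""
    r = M.sum(axis=1)
    c = M.sum(axis=0)
    Z = M / np.sqrt(np.outer(r, c))
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return [U[:, k] / np.sqrt(r) for k in range(1, how_many + 1)]


def count_bends(values, tol=1e-9):
    """Number of changes of direction along the rank order of respondents."""
    steps = np.diff(values)
    signs = np.sign(steps[np.abs(steps) > tol])
    return int(np.sum(signs[1:] != signs[:-1]))


M = perfect_scale_indicator(4)
first, second, third = component_scores(M, 3)

bends = [count_bends(v) for v in (first, second, third)]
linear_r = np.corrcoef(first, second)[0, 1]

# Fit the second component as a quadratic function of the first.
coeffs = np.polyfit(first, second, 2)
residual = np.max(np.abs(np.polyval(coeffs, first) - second))
print(bends, linear_r, residual)
```

In this uniform case the first component is monotone in the rank order, the second has one bend and the third two; the linear correlation between the first and second is zero to rounding error, while the quadratic fit leaves no residual.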


In another sense, it has been found that the further principal components provide 
a psychological frame of reference for scalable data. The second principal component 
for scores has been identified for many attitudinal data as the intensity of attitude (5). 
More recently, the third and the fourth component for scores have been found to 
portray somewhat new psychological concepts, which have been named closure
and involution, respectively (9). 


For non-scalable data, the principal components will not, of course, have a 
law of formation such as that just described, and indeed may have no useful law at 
all. Progress has also been made in developing meaningful structural theories 
for non-scalable qualitative data, and here various new kinds of systems of principal 
components appear possible. For the beginning of these developments, which go 
under the general name of the ‘theory of facets,’ see (10).


An unexpected consequence of these discoveries for qualitative data has been 
the rethinking it has stimulated about factor analysis in general. A new approach 
has been developed for qualitative data with linear regressions, which also leads to 
laws of formation of principal components. The general theory is called that of 
the radex and it contains the simplex and circumplex as subtheories. Empirical data 
tend to confirm this new approach for the case of tests of mental abilities (11). 


IV. THE STUDY OF DEVIATIONS: IMAGE 
ANALYSIS AND NODAL ANALYSIS 


The above-mentioned type of difference equation, and its lawful principal 
components, holds for the case of a perfect scale. An important problem that must 
be considered arises from the fact that perfect scales are rarely, if ever, to be found 
in practice. More typical may be the case of quasi-scales, or scales ‘plus’ error.
Here, then, it becomes important to distinguish between the structural and the 
deviant aspects of the observed data : only the former will have a law of formation. 

Two new general approaches for dealing with the problem of deviations from 
perfect structures are called image analysis and nodal analysis respectively. Both 
of these are useful for many kinds of non-scale structures as well as for quasi-scales. 

One image technique, Israel Alpha, for studying quasi-scales has been described 
in (14). The Israel Beta technique has also been used successfully, and will be 
described in future publications.

Nodal analysis can be regarded as a generalization and simplification of 
Lazarsfeld's latent structure theory. It leads to easy computing routines, whether 
for quasi-scale or non-scale structures (15).

Here we cannot do more than call attention to these newer developments,
which have not yet been published in much detail. The work done thus
far in probing the possibilities of structural theories for qualitative data seems to 
indicate that we are merely scratching at the beginnings of a rich and intricate subject, 
which fully deserves to be studied in its own right. 




REFERENCES
1. Burt, Cyril (1950). ‘The factorial analysis of qualitative data.’ Brit. J. Psych., Stat. Sect.,
III, 166-85.
2. Guttman, Louis (1941). ‘The quantification of a class of attributes,’ in Horst, P., et al. The
Prediction of Personal Adjustment. New York: Social Science Research Council, 319 ff.
3. Guttman, Louis (1944). ‘A basis for scaling qualitative data.’ American Sociological Review,
IX, 139-50.
4. Guttman, Louis (1947). ‘The Cornell technique for scale and intensity analysis.’ Educ.
and Psychol. Measurement, VII, 247-80.
5. Guttman, Louis, and Suchman, E. A. (1947). ‘Intensity and a zero point for attitude analysis.’
Amer. Sociol. Rev., XII, 57-67.
6. Guttman, Louis (1947). ‘Scale and intensity analysis for attitude, opinion, and achievement,’
in Kelly, George A. (editor). New Methods in Applied Psychology. Maryland: University
of Maryland Press, 173-80.
7. Guttman, Louis (1950). ‘The principal components of scale analysis,’ ch. 9 in Stouffer,
S. A., et al. Measurement and Prediction. Princeton: University Press.
8. Guttman, Louis (1951). ‘Scale analysis, factor analysis, and Dr. Eysenck.’ Internat. J. Opinion
and Attitude Research, 5, 103-20.
9. Guttman, Louis. ‘The principal components of scalable attitudes,’ in Lazarsfeld, Paul F.
(editor). Mathematical Thinking in the Social Sciences. New York: Columbia University
Press (in the press).
10. Guttman, Louis. ‘The theory of facets.’ Ibid.
11. Guttman, Louis. ‘A new approach to factor analysis: the radex.’ Ibid.
12. Guttman, Louis, and Suchman, E. A. (1947). ‘A solution to the problem of question “bias.”’
Public Opinion Quarterly, XI, 445-55.
13. Guttman, Louis, and Foa, Uriel G. (1951). ‘Social contact and an intergroup attitude.’ Public
Opinion Quarterly, XV, 43-53.
14. Guttman, Louis (1951). The Israel Alpha technique for scale analysis (stencilled). Available
on request from the Israel Institute of Applied Social Research.
15. Guttman, Louis. ‘The theory of nodal analysis’ (in preparation).


Vol. VI, Part I        The British Journal of Statistical Psychology        May, 1953


SCALE ANALYSIS AND FACTOR
ANALYSIS


Comments on Dr. Guttman's Paper ¹


By CYRIL BURT 
University College, London 


I. The Analysis of Answer Patterns. II. Criticisms of the Factorial Approach. III. Factor
Analysis of a Typical Scale Pattern. IV. Factor Analysis of the Corresponding Answer
Pattern. V. Summary.


I. THE ANALYSIS OF ANSWER PATTERNS 


Quantitative and Qualitative Data. It is encouraging to find that Dr. Guttman has 
discerned points of resemblance between the methods we have independently reached for 
analysing qualitative data; and I willingly agree that this convergence in our lines of approach 
lends additional plausibility to the results. If, as I gather, he cannot wholly accept my own 
interpretations, that perhaps is attributable to the fact that our starting-points were rather 
different. My aim was to factorize such data; his to construct a scale. In the paper ² to
which he has referred, I sought to develop a practicable procedure for discovering the factors
underlying qualitative characteristics, and then to apply that procedure to certain recurrent
problems of my own. Dr. Guttman's purpose, as the title of his previous contribution
indicates, was to present ‘a theory and method of scale construction’ by means of
‘quantifying a class of attributes.’ I have myself commented ³ on the similarity between my
equations for factor-measurements and the equations for component scores given by Dr.
Guttman, and ascribed it to the fact that his mode of deriving components seemed identical
with the method proposed by Pearson for calculating ‘index characters,’ a method which
had formed the original basis for my own factorial procedures.


Dr. Guttman's technique of scale analysis has been developed primarily in connexion with
the study of the internal structure of a universe of attitude and opinion items; and the data
he has used to illustrate his procedure consist, as a rule, of responses to questionnaires. I myself,
however, first encountered the type of problem which he has discussed in a rather different field,
namely, in the attempt to refine the scaling of the original Binet-Simon intelligence-tests. Before
the publication of Binet's 1911 version, British psychologists relied chiefly on tests of an ‘internally
graded’ type: opposites, dotting, card sorting, and the like. Binet introduced a composite échelle
comprising a number of ‘externally graded items,’ i.e., questions or tasks for which the responses
differ not in quantity but only in quality, and which are simply marked ‘right’ or ‘wrong.’ ⁴ Thus,


1 I am much indebted to Dr. Guttman for his kindness in reading my manuscript and to Dr. M. L.
Fraser for his help in checking my calculations.
2 This Journal, III, pp. 166-85. As there explained, the theoretical part of the paper was based
on a technical memorandum prepared during the early stages of the war for the Department of
Personnel Selection of the War Office. The factorial procedure had already been applied to qualitative
physical characteristics, and to data obtained from the Binet-Simon Scale, for which it was
originally developed.
3 This Journal, V, p. 200. Dr. Guttman observes (p. 2 above) that the method I have followed yields
the same solution as his for the ‘principal component weights.’ But, as I shall show in a moment,
there is (with a suitable standardization of the data) a still more striking resemblance between the
figures reached by the two different methods for the ‘principal component scores.’
4 For a discussion of the principles underlying the two types of test, see Burt, C. (1914), ‘The
Measurement of Intelligence by the Binet Tests,’ Eugen. Rev., VI, pp. 36-50, 140-52.




with an internally graded test the numerical data consist of measurements ; with the externally 
graded, of frequencies. Hence for the former the appropriate statistical methods appeared to be 
those developed for ‘quantitative variables’ (e.g., the ordinary product-moment correlation);
for the latter, those described by Yule for what he called ‘attributes’ (e.g., methods of contingency
and association). 

In the chapters contributed to Measurement and Prediction (18) both Dr. Guttman and 
Dr. Lazarsfeld draw a sharp distinction between the principles involved in these two cases. Factor 
analysis, they maintain, has been elaborated solely with reference to data which is quantitative 
ab initio ; hence, they suppose, it cannot be suitably applied to qualitative data. On this side of the 
Atlantic, however, there has always been a tendency to treat the two cases together, and, with this 
double application in view, to define the relevant functions in such a way that they will (so far as 
possible) cover both simultaneously.¹ British factorists, without specifying very precisely the assumptions
involved, have used much the same procedures for either type of material. Nevertheless, there
must of necessity be certain minor differences in the detailed treatment. These were briefly indicated 
in the paper Dr. Guttman has cited ; but they evidently call for a closer examination. I think in 
the end it will be found that they are much slighter than might be supposed. 


Example. With qualitative material, the special problems that arise and the general
mode of approach can be best explained if we start with a concrete illustration. Suppose
we want to classify the children in some particular age-group (all those aged six, let us say)
into 5 grades of ability by means of a series of dichotomous test-items like the questions in
the Binet Scale. If each item constituted a pure and reliable test for the ability to be assessed,
then, arguing along the lines of ‘information theory,’ ² and proceeding by a progressive
subclassification, we should evidently need six items altogether: namely, (i) a test like
5 Numbers, which would separate those below average from those above, i.e., those with a
mental age of less than 6 from those with a mental age of 6 or more; (ii) a test like 4 Numbers,
which would divide those with a mental age below average into those with a mental age of
less than 5, and those with a mental age of 5 to 6; and (iii) a test like 3 Numbers, which
would divide those with a mental age of less than 5 into those with a mental age of less than
4, and those with a mental age of 4 to 5. Similarly, we should require three more tests to sub-classify
those who were above the average for age 6.

In an experiment planned to study the working of the tests, the data obtained would
be entered on a mark sheet similar to that shown in Table I. This is a special case of the
more general table ³ printed in my previous article ((17), Table I). Now, however, each
‘determinable’ presents two ‘determinates’ only, i.e., each item yields two ‘categories,’
namely, ‘pass’ (marked 1) and ‘fail’ (marked 0). British investigators ⁴ describe such
tables as ‘answer patterns’; and, from the early days of mental testing, they have devoted
a good deal of attention to the simpler types of pattern which the answers from these non-graded
tests appear to display.

Table I records the results obtained with six tests from the London version of the Binet
Scale, when applied to a representative sample of 6-year-old school children, from whom,
for simplicity, I have chosen 10 individuals only. It may be compared with the typical table
given by Dr. Guttman in his discussion of the ‘components of scale analysis’ ((18), p. 317,
Table I).

The method of recording differs from his in two minor respects. First of all, in accordance
with the principle followed by most practical test-constructors, the battery is designed to contain
at least one item so easy that all the children can answer it and at least one so difficult that all are


1 See, for example, (14), I, pp. 15f., where the Stieltjes integral is used to facilitate this combined
treatment. The work of the French school (Borel and his colleagues), by deriving the theory of
measurement from the theory of sets, makes the relations still simpler (see (15), opening chapters).
A yet more recent and relevant example is Kendall's ‘general coefficient of correlation,’ which
applies equally well to measurements, ranks, and dichotomous classifications (Kendall, M. G.,
Rank Correlation Methods, 1948, pp. 17f.).
2 Cf. this Journal, IV, p. 196. The principles involved are analogous to, but of course not
identical with, the conversion of an integral measurement from the ‘[...] notation’
to the ‘binary,’ where only two digits, 0 and 1, are used.
3 Dr. Guttman says in his paper (p. 1 above) that this table is identical in form with ‘the matrix M
on p. 326 of the monograph.’ I should have thought it was rather to be compared with the very
first table in his monograph (p. 321), since that (like mine) presents the initial record sheet which we
require to analyse, and shows the various items explicitly classified into categories and subcategories.
4 The same mode of tabulation is adopted by Ferguson and other British writers: cf. (6) and (11).


TABLE I. ANSWER PATTERN FOR DICHOTOMOUS TESTS

                                      Persons
Tests          A.A.  B.A.  B.B.  C.A.  C.B.  C.C.  C.D.  D.A.  D.B.  E.A.  Total  Differences
Repeating:
3 Numbers       1     1     1     1     1     1     1     1     1     1     10
                                                                                      1
4 Numbers       0     1     1     1     1     1     1     1     1     1      9
                                                                                      2
5 Numbers       0     0     0     1     1     1     1     1     1     1      7
                                                                                      4
6 Numbers       0     0     0     0     0     0     0     1     1     1      3
                                                                                      2
7 Numbers       0     0     0     0     0     0     0     0     0     1      1
                                                                                      1
8 Numbers       0     0     0     0     0     0     0     0     0     0      0
Total           1     2     2     3     3     3     3     4     4     5     30
Frequencies     1        2                 4                    2       1     10


likely to fail. Hence N + 1 tests are deemed necessary to discriminate N types of persons. For
theoretical work, however, where we are more interested in factorizing items than in grading persons,
it is better to include n + 1 types of persons in order to analyse the answers from n tests.
Secondly, the ordinary answer pattern gives only the positive replies, whereas Dr. Guttman enters
negative as well as positive, and thus has 2n categories for n tests. It will be convenient to keep his
phrase, ‘scale pattern,’ to designate his type of table, and to reserve the phrase ‘answer pattern’
to designate a mark sheet of the more familiar type.

Problem. In the earlier investigations of the Binet and similar tests, one of my chief
objects was to determine the factors in terms of which such answer patterns could best be
described. The factors appeared to be of two kinds: ‘content factors’ and ‘difficulty
factors.’ ¹ Could we so purify our scale that it was virtually a perfect instrument for measuring
‘general intelligence,’ all factors resulting from special abilities would be eliminated. In
such a case there would still remain two obvious factors for which we should wish to secure
specifications: the general factor of ‘difficulty,’ common to all the tests, and the general
[...] content to give each person and each item an equal weight, [...]lations required
for a full [...] assess the difficulty of the items by simply adding the number of correct
replies obtained for each item, and the [...] given by each child (cf. the [...] pattern
above plainly indicates, there is a certain reciprocity between the two factors; and, indeed,
either concept may be [...]

The Unique Answer Pattern. When the sample of [...]
have been picked more or less at random, then the frequency-[...]
approximately normal or at least bell-shaped. In practice, however, an attempt is generally
made to choose items that will yield a rectilinear scale. Binet's method of standardizing
his problems by grouping children according to chronological age was intended to provide
a rectilinear distribution for persons as well (or rather for categories of persons). Evidently,
if we had one perfectly reliable item for each level of difficulty and one perfectly representative
child for each level of ability, the resulting answer pattern would then be uniquely determined.

[...] a double selection of items and persons on an answer pattern
like that [...] the items have been efficiently selected, but the children are
unselected (or just randomly selected from each age-group), then the line dividing successes from
failures [...] normal ogive. In such a case, if the number of children is large,
the table showing the answer pattern is usually condensed so as to allot a column to each age-group


1 The ‘difficulty’ factors found in such analyses have been criticized by Mr. Gourlay as artefacts.
For his criticisms, see this Journal, IV, pp. 65-76.
2 Burt, C. ‘Test Construction and the Scaling of Items,’ this Journal, IV, 1951, p. 106, Sect. III (a).




[...] test comprised of a perfectly consistent set of items would [...]
group of persons within the sample, that order should always remain the same, no matter how the
sub-group is selected. (ii) The N_i persons passing any given item will always be the N_i most able
persons in the group, no matter what N_i may be. (iii) A total of x [...]
by passing the x least difficult items in the series, no matter what x may be. [...]
frequencies in the distribution of persons (shown near the bottom of the table) [...]
the first differences between the successive numbers of persons passing each test (shown at the side
of the table). Thus, as Walker points out (6), except so far as there may be ties for individuals having
the same level of ability, or for items belonging to the same level of difficulty, the answer pattern
will be completely fixed by the order of the items and the order of the persons. If, therefore, the
frequency distribution of the persons is stated, we can deduce from the total mark gained by any one
person, not only how many items, but also which items, he has in fact passed.
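
These properties of a perfectly consistent pattern are easily verified on a toy example. The sketch below is illustrative only: the ten total marks are hypothetical figures chosen to give group frequencies of 1, 2, 4, 2, 1, and the function names are not taken from any source.

```python
from collections import Counter

# Items in order of difficulty, easiest first.
ITEMS = ["3 Numbers", "4 Numbers", "5 Numbers", "6 Numbers", "7 Numbers", "8 Numbers"]


def items_passed(total, items):
    """In a perfectly consistent scale a total of x marks means the x easiest items."""
    return items[:total]


def answer_pattern(totals, items):
    """Rebuild the whole 0/1 pattern from the persons' totals alone."""
    return {item: [1 if total > rank else 0 for total in totals]
            for rank, item in enumerate(items)}


# Hypothetical total marks for ten persons, arranged in order of ability.
totals = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]

pattern = answer_pattern(totals, ITEMS)
passes = [sum(pattern[item]) for item in ITEMS]               # persons passing each test
differences = [a - b for a, b in zip(passes, passes[1:])]     # first differences
frequencies = [Counter(totals)[t] for t in range(1, len(ITEMS))]

print(items_passed(3, ITEMS))
print(passes, differences, frequencies)
```

With these figures the first differences between the numbers passing successive tests reproduce the frequency distribution of the persons, and a total of 3 identifies the three easiest items as the ones passed.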

[...] Inconsistency. With the majority of the pass-or-fail tests in common use, the
[...] and the resulting pattern never displays the perfect regularity [...]. For such
irregularities there are two main causes. [...] Both persons and items suffer from what
might be called an internal unreliability. To some extent this can be obliterated by grouping. We
may, for example, begin by grouping persons according to age, ability, sex, social status, and the like.
[...] groups (cf. the tables of rank-differences [...] large enough to be statistically
significant, such variations imply (in the language of analysis of variance) an ‘interaction’
between the special abilities or special attainments of the persons [...]. For certain statistical purposes,
the differences can be still further smoothed out by grouping the items. But for practical purposes
[...] regard them as a cue for improving the way the items have [...] and (since they
inevitably [...] discriminative (cf. (11), pp. 55-8). It therefore becomes
desirable to find some method for measuring what I have called the ‘homogeneity’ and
‘heterogeneity’ of any test-battery proposed for practical use.


1 [...] in the responses obtained with the original version of
the scale were pointed out in an early report. The most conspicuous occurred [...]
items like Transcription and Reading, tasks which nearly [...]




The Measurement of Heterogeneity. The general characteristics of consistent and inconsistent
answer patterns will now be clear. As we have seen, when the tests are arranged in
order of difficulty and the persons in order of ability, then the answer pattern for an ideal test
should display a solid block of 1's above the diagonal line of division (i.e., in what I shall
call the ‘success triangle’) and a solid block of 0's below it (in what I shall call the ‘failure
triangle’). Inconsistencies will be indicated by erratic 0's appearing in the former area and
erratic 1's in the latter; and the pattern will in consequence no longer show a sharp or cleanly
defined line separating all the individual passes from all the individual failures. The effect
of these irregularities Thomson designated by the term ‘higgledypiggledyness.’ Any
coefficient obtained by counting up their number (or some equivalent procedure) he proposed
to call a ‘coefficient of hig’ and its complement a ‘coefficient of unig’ (uniqueness). The
words he has coined avoid the ambiguities of vaguer terms, like ‘heterogeneity’ and
‘inconsistency,’ which are used in so many contexts with so many different meanings.

With a perfectly haphazard distribution of answers we should expect half the total
number of successes to be scattered over one triangle, and half over the other. An obvious
index of homogeneity or ‘unig’ (U) is therefore given by the formula

$$U = \frac{\text{No. of Actual Successes in Success Triangle} - \tfrac{1}{2}\,\text{Total No. of Successes}}{\tfrac{1}{2}\,\text{Total No. of Successes}}.$$

Writing this in the form

$$U = 1 - \frac{4 \times \text{No. of Successes in Failure Triangle}}{\text{Total No. of Responses}} = 1 - \frac{4e}{n \times N},$$

we see that it is virtually the same as Guttman's ‘coefficient of reproducibility’ ((18), p. 77).
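
The passage from the first form to the second can be checked on a small pattern. The sketch below makes several assumptions of its own: the staircase layout of n tests by n + 1 persons (for which the successes make up exactly half of all responses), the reading of e as the number of successes falling in the failure triangle, and the hypothetical function name `unig_two_ways`. It is not Guttman's own computing routine for reproducibility.

```python
import numpy as np


def unig_two_ways(observed, ideal):
    """Compute U from both forms; `ideal` marks the success triangle."""
    observed = np.asarray(observed, dtype=bool)
    ideal = np.asarray(ideal, dtype=bool)
    successes = observed.sum()
    in_success_triangle = (observed & ideal).sum()
    in_failure_triangle = (observed & ~ideal).sum()   # read here as e
    first_form = (in_success_triangle - successes / 2) / (successes / 2)
    second_form = 1 - 4 * in_failure_triangle / observed.size
    return float(first_form), float(second_form)


n_tests = 4
# Rows = tests (easiest first); columns = n_tests + 1 persons (least able first).
ideal = np.array([[j > i for j in range(n_tests + 1)] for i in range(n_tests)])

observed = ideal.copy()
observed[0, 1] = False   # an erratic 0 in the success triangle
observed[3, 2] = True    # an erratic 1 in the failure triangle

u_perfect = unig_two_ways(ideal, ideal)
u_observed = unig_two_ways(observed, ideal)
print(u_perfect, u_observed)
```

For the perfect pattern both forms give 1; with one erratic answer in each triangle both give 0.8. The two forms agree here because the total number of successes is half the total number of responses; where that proportion does not hold, the second form is only an approximation to the first.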


In practice, however, with the ordinary type of composite test, such a coefficient does 
not prove altogether satisfactory. For one thing, with the same set of items, it varies accord- 
ing to the distribution of the sample of persons. In such a case an alternative method is to 
compare the frequencies for the individuals with the first differences between the totals 
for the tests ; and accordingly, after considering various possibilities, Walker proposed a 
‘coefficient of hig’ based on the sum of the squares of the discrepancies between the two
sets of figures. But even this, as he himself points out in a more recent paper, is not entirely
free from objection (6). 

There are several other ways in which a coefficient might be conceivably developed. We might,
for instance, use any of the coefficients of intrinsic reliability, e.g., the split half method, or the
variance method which I have described as giving the mean value for all possible split halves.¹ Or
again we might construct from the data the ideal answer pattern which gives the closest fit, and test
the discrepancies by the square contingency (‘chi-squared’); the agreement could then be measured
by the root mean square contingency φ. When the scales or questionnaires include a large number of
items, such procedures become decidedly laborious; and probably most British writers would
concur with Ferguson's conclusion: “at present no convenient measure is available for estimating
the divergence of an obtained answer-pattern matrix from a theoretically unique matrix” ((11) p. 54).


The Advantages of a Factorial Analysis. For general purposes, the best course, in 
my own view, is to undertake a factorial analysis of the data obtained. Since we are 
interested primarily in difficulty factors, the natural approach would be to correlate persons. 
If the persons are grouped in categories, the matrix of correlations or product-sums need 
not be large. Indeed, it will often be sufficient to compare the rank-orders of the several 
items for the different groups ((4), pp. 132f., 146f.). For more intensive studies, a complete 
analysis is desirable,² particularly as that may incidentally shed light on the presence of


1 Burt, C., ‘The Reliability of Teachers' Assessments,’ Brit. J. Educ. Psychol., XV, 1945, pp. 91-2.
The formula giving the mean for all possible split-halves appears to be essentially the same as that
given by Kendall for the ‘coefficient of concordance’ ((14), p. 411).
2 This [...] the data relate to non-cognitive characteristics, e.g., social attitudes,
Rorschach answers, and the like. Many of the items in a case-history are as a rule recorded only
in terms of presence or absence, e.g., the occurrence of some accident, disease, or hypothetical gene,
and especially certain recurrent features in the social environment: parents or relatives of this or
that type, membership of this or that group or circle, contact with this or that member of the group.
In all such cases the resulting ‘nought-and-one’ record-table can be treated like a ‘measurement
matrix’ and factorized in the manner described; and the results obtained often yield suggestive
ways of describing the intricate pattern of a life history or a social group (cf. (17), p. 166 and refs.).




Scale Analysis and Factor Analysis 


irrelevant content factors. In such a case it will be better to start by correlating or comultiplying
items rather than persons. The method of weighted summation, with unreduced item-variances,
should be employed by preference. But simple summation will furnish a good
approximation. The working procedures, and the results obtained in the case of empirical
test-items, have been sufficiently discussed and illustrated in earlier papers (cf. (13), (15), (17),
and refs.). In view of recent criticisms, however, it seems desirable to consider more closely
how far it may be legitimate to treat qualitative data in this fashion, and particularly whether
such methods can yield satisfactory results in the case of a perfect scale.


II. CRITICISMS OF THE FACTORIAL APPROACH


The Relations between Scale Analysis and Factor Analysis. In his very instructive paper 
(pp. 1-4) Dr. Guttman has indicated what he takes to be the differences, as well as the 
similarities, between his approach and my own. He observes that, while the ‘principal
components’ employed in his method of scale analysis may be ‘formally similar’ to those
employed in factor analysis, “nevertheless their interpretation may be quite different.”


The differences are examined more fully in his earlier contribution on ‘The Relation of
Scalogram Analysis to Other Techniques.’¹


If I understand Dr. Guttman rightly, his objections turn on five main points. 


1. First, he holds that factor analysis is “designed only for quantitative variables,” and is
consequently unsuited for qualitative data ((18), p. 191). The method, he says, “originated as a
single factor theory by Spearman, and was developed into a multifactor theory by Thurstone and
others.” Were his own procedure to be described in factorial terms, then, he adds, we should have
to treat it as “a single factor theory for qualitative data.”


However, as other writers have pointed out (cf. this Journal, V, p. 206), such statements limit
the term ‘factor analysis’ to very specific forms. In point of fact, the technique now generally
known as factor analysis is much older than Spearman’s procedures. It originated with Pearson’s
proposal to reduce a given multivariate distribution to terms of the ‘principal axes of the frequency
ellipsoid’ and take these axes as representing ‘index characters.’ It was Pearson’s investigation of
the general problem that really supplied the earliest “algebraic formulation, leading” (if I may
borrow Dr. Guttman’s phrase) “to the resolution of the data into principal components (latent
vectors)” ((2), pp. 559f.; cf. (15), p. 309). Factor analysis was thus a multifactor method from the
start. Spearman’s single factor method was developed several years later as a substitute, because he
held that Pearson’s approach was unsuited to the data obtained in psychology. Nevertheless, in
spite of his criticisms, numerous researches were carried out in which Pearson’s method was applied,
not only to quantitative measurements, such as those furnished by graded tests, but also to qualitative
data, such as were supplied by dichotomous test-problems and by questionnaires. In a footnote,
Dr. Guttman ((18), p. 193) refers to the treatment worked out by Yule (Karl Pearson’s assistant)
for dealing with frequency-tables for qualitative variables (3), and considers it “strange that statistical
text-books in the social sciences have not followed suit, but fail to discuss material of this kind at all.”
But as a matter of fact in this country psychologists have made free use of Yule’s procedures, especially
in relation to social data²; and numerous theses could be cited where factorial methods have been
applied to tables of frequencies, contingencies, or Yule’s coefficients of colligation or point-correlation.
In particular, ‘answer patterns’ have regularly been subjected to a factorial analysis by various
devices (cf. (6), p. 326, (11), p. 52, and refs.).


¹ (18), pp. 191f.; in the same volume Dr. Lazarsfeld also brings forward somewhat similar
criticisms (cf. ibid., pp. 469f.).

² In dealing with the item-responses obtained with the Binet Scale (where the classification was in
terms of pass or fail) and with many social conditions affecting delinquency (where the classification
was in terms of presence or absence) Yule’s formulae were used at a very early stage (cf. Mental
and Scholastic Tests, 1921, pp. 217f., and The Young Delinquent, 1925, pp. 54f.). The tetrachoric
coefficient advocated by Pearson and criticized by so many recent writers came into vogue somewhat
later (cf. (13)). Yule’s method of treating frequency-tables in cases of manifold classification involves
fitting the matrix of observed frequencies with a matrix of expected frequencies derived by a procedure
identical with that of centroid analysis. And when, instead of the more laborious method of
‘weighted summation’ entailed by Pearson’s equations, I put forward the formula for ‘simple
summation’ (the formula afterwards termed the ‘centroid formula’ by Thurstone), I pointed out
that it was, in effect, an extension to quantitative variables of the usual formula for investigating
association in the case of qualitative attributes (cf. (3), p. 64, eqns. (1) and (2)).




C. BURT 


2. Secondly, Dr. Guttman argues that, in order to apply factor analysis, we must begin by
calculating correlation coefficients, and that in the case of qualitative data such coefficients are
misleading. With his criticisms of the uses made of the tetrachoric coefficient and the point-
correlation I very largely agree. Yet his arguments seem only to prove that these coefficients are not
suitable for all occasions or for every purpose. There is one coefficient which he does not explicitly
consider: the ordinary product-moment coefficient applied to the data after they have been
transformed to standard measure; and this, which has been frequently used for such problems, yields
(as I shall show in a moment) results that are virtually the same as his. Nevertheless, nothing in the
theory of factor analysis confines its application solely to coefficients of correlation. Indeed, for the
factorist there is often a special advantage in working with frequencies, since (as I pointed out in
the paper to which Dr. Guttman refers) the higher order frequencies may prove especially serviceable
in the calculation of group factors.¹


3. His third argument runs as follows. The principal criterion for scalability is reproducibility.
But factor analysis does not allow us to reproduce the original data from the so-called factor-
measurements. Hence factor analysis can never show whether a scale is perfect or not. This objection,
however, applies only to the special procedures advocated by Spearman and Thurstone, which seek
to analyse, not the total variance, but merely the common factor-variance. The method of principal
axes, on the other hand, requires the full test-variances to be retained in the covariance matrix; and
with Pearson’s procedure the factor-measurements are obtained by pre-multiplying the initial
measurements by the matrix of direction cosines (the latent vectors). Now such a matrix is necessarily
orthogonal. Hence its transpose can be used as a second pre-multiplier to reproduce the initial
measurements from the factor-measurements (cf. (10), Appendix II). An exact reproduction is
therefore possible. If, then, we can also show that, with a perfect scale, one of the factors so obtained
is in perfect correlation with the rank of the persons, it would seem that the method can after all
provide an entirely satisfactory criterion.

4. “The Spearman-Thurstone approach to factor analysis,” as Dr. Guttman says, “is com-
pletely linear, and is therefore not adequate for analysing the curvilinearities inherent in the scale
pattern.” Certainly in that mode of approach the factor-measurements are always estimated by
the method of linear regression, as developed in Pearson’s earlier papers. But Pearson himself
also elaborated a method for dealing with curvilinear regressions.² His treatment was intended
primarily for problems involving an external criterion; but it is equally applicable to the case of
an internal criterion or factor. Elsewhere I have argued that it is quite unnecessary to restrict the
theory of factor analysis to linear relations only ((10), p. 258); and, with the aid of the orthogonal
polynomials used in the theory of curvilinear regression,³ it is simple to estimate factor-measure-
ments from test data or test data from factor-measurements on the assumption of non-linear relations.


5. Finally, Dr. Guttman concludes that “from a scale analysis it can be known what a factor
analysis will show; from a factor analysis it will usually be difficult, if not impossible, to know what
a scale analysis will show.” To determine this point I propose to apply a factorial procedure to his
own table, and see how far the results achieved are similar to those reached by his own scale analysis.
A concrete numerical example will probably help best to explain how far our methods are
similar and in what ways they seem to differ.


III. A FACTOR ANALYSIS OF A TYPICAL SCALE PATTERN


An Answer Pattern for Five Items. In Table II I have reproduced the ‘scale pattern’
which Dr. Guttman uses in his chapter on ‘The Principal Components of Scale
Analysis’ ((18), pp. 317, 333) to illustrate his own procedure. It represents a perfect scale
formed by the responses of 6 ‘types of person’ to five test-items classified under 10 ‘types
of category’ (for brevity I shall refer to them simply as ‘persons’ and ‘categories’). In
the last line I have appended the value to which the entries should be changed in order to
secure the same square-sum (1.000) for each category.


¹ (17), p. 181. The memorandum already quoted gave a worked example from the Binet tests, and
an algebraic proof (summarized in this Journal, V, p. 3[…]).

² The earliest systematic treatment of curvilinear regression is to be found in Pearson, K., ‘On the General
Theory of Skew Correlation and Non-Linear Regression,’ Biometric Series, II, 1905; cf. also
‘On a General Method of Determining the Successive Terms in a Skew Regression Line,’ Biometrika,
XIII, 1921, pp. 296f.

³ See (14), II (esp. pp. 158f., ‘Curvilinear Regression: Case when the Independent Variate proceeds
by Equal Steps,’ and refs.), (5), and (8). Greenleaf (7) also includes useful refs.


This table corresponds to Table I in my previous article ((17), p. 171). There, however, the possi- 
bility was envisaged of more than two categories for each item, and those referring to the same 
item were kept together. Here there are two categories only—positive (e.g., correct answers) 
marked a, and negative (e.g., incorrect answers) marked b. Adopting Dr. Guttman’s mode of 
tabulation, I have now grouped all categories of the same type together, regardless of item, and 
have numbered the items from 1 to 5 as their difficulty decreases, and the persons from 5 to 0 as their 
ability decreases. The reader may perhaps find it easier to interpret the factorial procedure if he 
thinks of the items as five test-problems from an ideal version of the Binet Scale, applied to typical 
children of six successive mental ages. 


TABLE II. SCALE PATTERN FOR FIVE DICHOTOMOUS ITEMS (GUTTMAN)


                                    Categories                          Frequency of
Persons        |   1a    2a    3a    4a    5a    1b    2b    3b    4b    5b  |  Response
   5           |    1     1     1     1     1     0     0     0     0     0  |     5
   4           |    0     1     1     1     1     1     0     0     0     0  |     5
   3           |    0     0     1     1     1     1     1     0     0     0  |     5
   2           |    0     0     0     1     1     1     1     1     0     0  |     5
   1           |    0     0     0     0     1     1     1     1     1     0  |     5
   0           |    0     0     0     0     0     1     1     1     1     1  |     5
Frequency of
Response       |    1     2     3     4     5     5     4     3     2     1  |    30
Standardized
Value (1/√N)   |    1  1/√2  1/√3  1/√4  1/√5  1/√5  1/√4  1/√3  1/√2     1  |


Factor Analysis. The method of weighted summation involves analysing the symmetric
matrix of standardized product-sums into its latent roots and latent vectors (cf. Burt, Factors
of the Mind, Appendix II). Accordingly we begin by calculating a symmetric contingency
table, i.e., the matrix of product-sums for items, $C_i = MM'$ ((17), p. 172, eqn. (i)); and then
standardize¹ it in the usual way by pre- and post-multiplying it by a diagonal matrix, $D^{-1/2}$,
where the elements of $D^{-1/2}$ are $1/\sqrt{N_i}$ (where $N_i$ denotes the number of persons in each
category). The figures thus obtained are shown in Table III (to save space I omit the last 4
rows, since they are obvious from the fact that the complete table must be axisymmetric).


TABLE III. STANDARDIZED MATRIX OF PRODUCT-SUMS (M)


Category |    1a     2a     3a     4a     5a     1b     2b     3b     4b     5b
   1a    |  1.000   .707   .577   .500   .447   .000   .000   .000   .000   .000
   2a    |   .707  1.000   .816   .707   .632   .316   .000   .000   .000   .000
   3a    |   .577   .816  1.000   .866   .775   .516   .289   .000   .000   .000
   4a    |   .500   .707   .866  1.000   .894   .671   .500   .289   .000   .000
   5a    |   .447   .632   .775   .894  1.000   .800   .671   .516   .316   .000
   1b    |   .000   .316   .516   .671   .800  1.000   .894   .775   .632   .447
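
The arithmetic behind Tables II and III can be checked mechanically. The following is a minimal sketch in Python, no part of the original paper; the function names `scale_pattern` and `standardized_product_sums` are hypothetical labels chosen for illustration. It builds the perfect scale pattern, divides each category column by the square root of its frequency, and forms the product-sums between categories.

```python
from math import sqrt

def scale_pattern(n_items=5):
    """Perfect scale: one row per person (score n_items..0), columns 1a..na then 1b..nb."""
    rows = []
    for score in range(n_items, -1, -1):
        # Items are numbered so that difficulty decreases from 1 to n_items:
        # a person of score s passes the s easiest items.
        passed = [1 if item > n_items - score else 0 for item in range(1, n_items + 1)]
        failed = [1 - p for p in passed]
        rows.append(passed + failed)
    return rows

def standardized_product_sums(marks):
    """Divide each category column by sqrt(N), then form all column product-sums."""
    n_cat = len(marks[0])
    freq = [sum(row[j] for row in marks) for j in range(n_cat)]
    z = [[row[j] / sqrt(freq[j]) for j in range(n_cat)] for row in marks]
    table = [[sum(r[j] * r[k] for r in z) for k in range(n_cat)] for j in range(n_cat)]
    return freq, table

marks = scale_pattern()
freq, table3 = standardized_product_sums(marks)
labels = [f"{i}a" for i in range(1, 6)] + [f"{i}b" for i in range(1, 6)]
for label, row in zip(labels, table3):
    print(label, " ".join(f"{value:.3f}" for value in row))
```

The printed rows include the four that the table omits; they follow from the symmetry of the complete matrix.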


(a) Factor-Saturations. The ‘saturations’ for the ith factor have been computed in the usual
way by multiplying the normalized elements specifying the ith latent vector by the square root of
the ith latent root. Six factors are obtained; and the resulting matrix is orthogonal by columns
(Table IV).


¹ As explained in my previous article, the main object of standardizing the data is to weight the
frequencies so as to compensate for the differences in numbers in the various sub-categories. Here
an incidental effect is to weight each item according to its difficulty.




TABLE IV. FACTOR-SATURATIONS FOR CATEGORIES (F)


                                       Factor                        | Square-
Category        |    G      BI     BII    BIII    BIV     BV        |   Sum
   1a           |  .408   +.598   +.545   +.373   +.189   +.063     |  1.000
   2a           |  .577   +.676   +.309   -.105   -.267   -.178     |  1.000
   3a           |  .707   +.622    .000   -.258    .000   +.218     |  1.000
   4a           |  .816   +.478   -.218   -.074   +.189   -.126     |  1.000
   5a           |  .913   +.267   -.244   +.167   -.085   +.028     |  1.000
   1b           |  .913   -.267   -.244   -.167   -.085   -.028     |  1.000
Square-Sum (v)  | 5.000   3.000   1.000    .500    .300    .200     | 10.000


(b) Factor-Measurements. To obtain the partial regressions or ‘weights’ for the several
categories ($W = V^{-1}F'$) it is only necessary to divide each column of factor-saturations by the
square-sum shown at the foot. We then apply these weights to the standardized form of the initial
matrix of marks. The weighted sums yield the values for the several factor-measurements ($P = WM$,
where $M$ is the standardized matrix of marks). These are given in Table V.


As I have endeavoured to show in the Appendix, where the scale pattern is perfect, the
factor-measurements can be obtained more directly by the formulae used in calculating the
constants for curvilinear regressions. These figures have already been tabulated. Thus if
the reader will turn to the tables of orthogonal polynomials given by Fisher and Yates,¹ and
take the values of $\xi'_i$ for i = 1, 2, . . . 5, normalizing each column by the aid of the square-
sums printed at the foot, he will find that the figures so obtained are identical with the factor-
measurements given in Table V (except for the first factor which always has the value
$+1/\sqrt{n}$). This set of tables may be used for a battery containing any number of items
from 2 to 75.

Taking the factor-saturations as weights, it will be found that we can reconstruct the
original standardized measurements exactly by the usual equation $M = FP$. With this
method of factor analysis, therefore, it can no longer be objected that the initial data are
not reproducible from the factorial results.


TABLE V. FACTOR-MEASUREMENTS FOR PERSONS (P)


                            Type of Person                       | Square-
Factor      |     5       4       3       2       1       0     |   Sum
   G        |  +.408   +.408   +.408   +.408   +.408   +.408    |  1.000
   BI       |  +.598   +.359   +.120   -.120   -.359   -.598    |  1.000
   BII      |  +.545   -.109   -.436   -.436   -.109   +.545    |  1.000
   BIII     |  +.373   -.522   -.298   +.298   +.522   -.373    |  1.000
   BIV      |  +.189   -.567   +.378   +.378   -.567   +.189    |  1.000
   BV       |  +.063   -.315   +.630   -.630   +.315   -.063    |  1.000
Square-Sum  |  1.000   1.000   1.000   1.000   1.000   1.000    |  6.000
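
The two claims made here, that the factor-measurements coincide with normalized orthogonal polynomials and that the standardized marks are exactly recoverable from saturations and measurements, can be illustrated numerically. The sketch below is an illustration only and not the author's working procedure: it assumes the polynomials may be generated by Gram-Schmidt orthogonalization of the powers of the person's score rather than read from published tables, and the helper name `orthonormal_polynomials` is hypothetical.

```python
from math import sqrt

def orthonormal_polynomials(n_points):
    """Rows = degrees 0..n-1, evaluated at x = n-1, ..., 0 and scaled to unit square-sum."""
    xs = [float(n_points - 1 - i) for i in range(n_points)]
    rows = []
    for degree in range(n_points):
        v = [x ** degree for x in xs]
        for u in rows:                       # Gram-Schmidt: remove lower-degree parts
            proj = sum(a * b for a, b in zip(v, u))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = sqrt(sum(a * a for a in v))
        rows.append([a / norm for a in v])
    return rows

n_items = 5
passed = [[1 if item > n_items - s else 0 for item in range(1, n_items + 1)]
          for s in range(n_items, -1, -1)]
marks = [row + [1 - p for p in row] for row in passed]        # persons x categories
freq = [sum(row[j] for row in marks) for j in range(10)]
Z = [[row[j] / sqrt(freq[j]) for j in range(10)] for row in marks]

P = orthonormal_polynomials(6)                                 # factors x persons
# Saturations: product-sums of each standardized category with each factor.
F = [[sum(Z[i][j] * P[k][i] for i in range(6)) for k in range(6)]
     for j in range(10)]                                       # categories x factors
roots = [sum(F[j][k] ** 2 for j in range(10)) for k in range(6)]
# Largest product-sum between two different columns of F (zero if they are latent vectors).
cross = max(abs(sum(F[j][k] * F[j][m] for j in range(10)))
            for k in range(6) for m in range(k))
# Rebuild the standardized marks from saturations and measurements (M = FP).
rebuilt = [[sum(F[j][k] * P[k][i] for k in range(6)) for j in range(10)]
           for i in range(6)]
error = max(abs(rebuilt[i][j] - Z[i][j]) for i in range(6) for j in range(10))

print([round(r, 3) for r in roots])
print(cross, error)
```

On these data the column square-sums of the saturations agree with the foot of Table IV, and the reconstruction error is of the order of rounding only.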


Scale Analysis. Dr. Guttman’s component weights and scores are derived directly
from the initial table of data. He computes no coefficients of correlation or contingency,
nor does he print any matrix of product-sums. The ‘component weights’ (if required)
he calculates only after the ‘component scores’ have been obtained. His threefold theoretical
approach has been succinctly summarized in the preceding paper (p. 1 above); and the
algebraic equations thus obtained (given in his previous publications) bring to light a number
of new and interesting features. Yet I am doubtful whether, for the ordinary student,


¹ (16), Table […]. The analogy between this mode of approach and that adopted by Fisher
[…] implies that an adaptation of his treatment might be used to test
significance in the case of empirical data: but that is a question I hope to take up in a later paper.




the working procedure which the equations suggest would be easier to follow or quicker to 
apply than the ordinary factorial procedure, especially as that can be appreciably simplified 
in the case of an artificially perfect scale. 


(a) Component Scores. The table given by Dr. Guttman as obtained by his own procedure
is reproduced in Table VI. Like the factor-measurements given in Table V, the scores given by
Dr. Guttman are orthogonal by factors (i.e., the rows are uncorrelated); but, unlike the former,
they are not orthogonal by persons. His figures in fact are identical with those to be found in the
table printed by Fisher and Yates¹ ((16), p. 70).


TABLE VI. SCORES FOR PERSONS (GUTTMAN)


                         Persons                      | Square-
Component  |    5     4     3     2     1     0      |   Sum
Constant   |    1     1     1     1     1     1      |     6
   I       |    5     3     1    -1    -3    -5      |    70
   II      |    5    -1    -4    -4    -1     5      |    84
   III     |    5    -7    -4     4     7    -5      |   180
   IV      |    1    -3     2     2    -3     1      |    28
   V       |    1    -5    10   -10     5    -1      |   252
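
The relation between these whole-number scores and the factor-measurements of Table V is easily verified. The following sketch (Python, offered as an illustration only) divides each row by the square root of its square-sum, and checks that the rows are orthogonal in both forms while the columns become orthogonal only after that division.

```python
from math import sqrt

scores = {
    "Constant": [1, 1, 1, 1, 1, 1],
    "I":   [5, 3, 1, -1, -3, -5],
    "II":  [5, -1, -4, -4, -1, 5],
    "III": [5, -7, -4, 4, 7, -5],
    "IV":  [1, -3, 2, 2, -3, 1],
    "V":   [1, -5, 10, -10, 5, -1],
}
names = list(scores)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

square_sums = {name: dot(row, row) for name, row in scores.items()}
unit = {name: [x / sqrt(square_sums[name]) for x in row] for name, row in scores.items()}

# Product-sums between different components (rows): all zero.
row_products = [dot(scores[names[a]], scores[names[b]])
                for a in range(6) for b in range(a)]

# Product-sums between persons (columns): zero only in unitary standard measure.
raw_columns = list(zip(*scores.values()))
unit_columns = list(zip(*unit.values()))
raw_5_4 = dot(raw_columns[0], raw_columns[1])
unit_5_4 = dot(unit_columns[0], unit_columns[1])

print(square_sums)
print(raw_5_4, round(unit_5_4, 12))
```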


(b) Component Weights. Having dealt with the problem of “quantifying the rank order of
people,” Dr. Guttman turns to that of “quantifying the category values of items.” These values he
terms the ‘component weights.’ The figures that he gives ((18), Table 5, p. 329) are not proportional
to my figures either for the ‘factor-saturations’ or for the ‘regression weights.’ They are, however,
indirectly related. We can in fact derive the proportionate values for Dr. Guttman’s ‘weights’ by
dividing the saturations for each category (given in Table IV) by the square root of the total number of
positive entries for that category, that is, multiplying them by 1/√N. We then arrive at the figures
shown in Table VII. The effect of this procedure is to allot the same weight for the general factor
to every category in the list, while, with the remaining factors, the figures obtained are now simple
multiples of the smallest figure in each column. In Dr. Guttman’s table of ‘principal component
weights’ the figures are either identical with these multiples or with simple fractions of them.


TABLE VII. UNSTANDARDIZED FACTOR-SATURATIONS (U)


Category | Multiplier    |    G      BI     BII    BIII    BIV     BV
   1a    | 1/√1 = 1.000  |  .408    .598    .545    .373    .189    .063
   2a    |         .707  |  .408    .478    .218   -.075   -.189   -.126
   3a    |         .577  |  .408    .359    .000   -.149    .000   +.126
   4a    |         .500  |  .408    .239   -.109   -.037    .094   -.063
   5a    |         .447  |  .408    .119   -.109    .075   -.038    .012
   1b    |         .447  |  .408   -.119   -.109   -.075   -.038   -.012


He insists that his use of the conventional term ‘weights’ should not be taken to imply that
such figures are needed to secure summational equations for computing the persons’ scores. Never-
theless, as he remarks, “it so happens that the equations of internal consistency prove that there is
a reciprocal relation between principal components for people and principal components for categories
that does involve an additive (more precisely, an averaging) process, so we are justified in part in
retaining the traditional nomenclature of ‘scores’ and ‘weights.’” This is in effect a recognition of
what I have elsewhere called the ‘reciprocity principle.’²
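
The averaging process can be made concrete. Dividing a saturation by √N a second time is arithmetically the same as averaging the factor-measurements of the persons who fall in the category. The sketch below (Python, an illustration under that reading and not Dr. Guttman’s own computing routine) takes the polynomial scores as given, reduces them to unit square-sum, and averages them category by category.

```python
from math import sqrt

polynomials = {
    "G":    [1, 1, 1, 1, 1, 1],
    "BI":   [5, 3, 1, -1, -3, -5],
    "BII":  [5, -1, -4, -4, -1, 5],
    "BIII": [5, -7, -4, 4, 7, -5],
    "BIV":  [1, -3, 2, 2, -3, 1],
    "BV":   [1, -5, 10, -10, 5, -1],
}
# Unit square-sum form of each row: the factor-measurements for persons 5, 4, ..., 0.
measurements = {
    name: [x / sqrt(sum(y * y for y in row)) for x in row]
    for name, row in polynomials.items()
}

persons = [5, 4, 3, 2, 1, 0]
members = {}
for item in range(1, 6):
    # Item k is passed (category ka) by persons of score 6 - k or more, failed by the rest.
    members[f"{item}a"] = [i for i, s in enumerate(persons) if s >= 6 - item]
    members[f"{item}b"] = [i for i, s in enumerate(persons) if s < 6 - item]

# Weight of a category on a factor = mean measurement of the persons in that category.
weights = {
    category: {
        name: sum(row[i] for i in idx) / len(idx)
        for name, row in measurements.items()
    }
    for category, idx in members.items()
}

for category in ["1a", "2a", "3a", "4a", "5a", "1b"]:
    print(category, " ".join(f"{weights[category][f]:+.3f}" for f in measurements))
```

The general factor receives the same weight in every category, and in the BI column the positive categories stand in the ratio 5 : 4 : 3 : 2 : 1.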


¹ As Fisher and Yates point out, the figures are proportionate values only, and for convenience the
figures given by the algebraic formulae have been converted to whole numbers by the fractions shown
at the foot of each table. As it happens, a table identical with Dr. Guttman’s is also printed in the
analysis of an example used by Mather to illustrate curvilinear regression (Statistical Analysis in
Biology, 1942, p. 139). The student will find the detailed discussion in Mather’s chapter exceedingly
illuminating, particularly in regard to the analysis of variance.

² Cf. (10), pp. 487-94. In the notation there used, writing $F = LV^{1/2}$, we have $M = LV^{1/2}P$, where
$LL' = I$ and $PP' = I$. Hence $P = V^{-1/2}L'M$ and $L = MP'V^{-1/2}$. Thus L can be regarded either
as normalized factor-saturations (weights) for tests obtained by comultiplying tests or as normalized
factor-measurements for tests obtained by comultiplying persons (in that case using $P'V^{-1/2}$ as
weights). Similarly there will be two alternative interpretations for P.




Special Relations Arising with Perfect Scales. In discussing the theory of answer
patterns, several writers have noted certain peculiar features characterizing the marginal
totals of the columns and rows of ideal patterns like that illustrated in Table I. Definite
relations are discernible both within and between (a) the series formed by the successive
totals (weighted or unweighted) that express the ‘difficulty’ of the tests, (b) the series formed
by the successive totals (weighted or unweighted) that represent the abilities of the persons
or categories of persons, and (c) the series formed by the successive differences obtained
by subtracting each total from the last. For example, as Walker has pointed out, if totals
for persons form a linear scale, their first differences must be equal, and “the distribution
of the raw scores (i.e., totals for persons) is equal to the first differences of the distribution
of the number of persons passing each item correctly, so that the scores are completely
determined by the difficulty values of the items” (see (6) and (11), pp. 53f.). Dr. Guttman’s
analysis generalizes these observations for all the components concerned.

The whole of his mathematical investigation of the subject ((18), pp. 334-61) is exceedingly
instructive. In particular he shows that, if we examine the artificial type of answer pattern given by
a ‘perfect scale,’ then, as he puts it in his paper (p. 2 above), we can discern “a definite law of
formation for the entire set of principal components.” Thus, with ‘scalable’ as with ‘non-scalable’
data, the principal components are of necessity uncorrelated; but, whereas with non-scalable
data they are in general completely independent, with ‘scalable’ configurations (as he points out)
“all the components are perfect (curvilinear) functions of the rank order.” In ordinary factor
analysis (with linear regressions) instances in which a simple ‘law of formation’ may be found have
often been noted in correlation tables of an artificial type (e.g., (10), pp. 300 and 306, and this Journal,
II, Table V, p. 112); and with curvilinear factors we should expect the same structural relations
to obtain as in the case of curvilinear regressions. The values for the latter can be built up from
the differences¹ by the process used by Fisher ((8), p. 146, cf. (14), II, pp. 164f.), a process not
unlike the cumulative summation introduced by G. F. Hardy to facilitate the calculation of moments
(cf. (4), 1941 ed., p. 442 and refs.). Thus I venture to suggest that the conditions imposed by the
requirements of a perfect scale could perhaps be most succinctly expressed by the equations worked
out for problems involving curvilinear regression, and that the same cumulative summation may be
used to facilitate the factorial analysis in dealing with data such as that considered above (see
Appendix, pp. 21-3 below).


Hierarchical Classification. These numerical relations, however, are not likely to be accurately
preserved in empirical tables obtained from actual tests or questionnaires. On the other hand, the
[…] when there are marked irregularities in the […]. […]s in Tables III and VII, it is evident
[…] the ‘genus’ or ‘universe’ to which the items belong. The second (BI) is a bipolar factor
distinguishing positive from negative categories. The third (BII) contrasts the […] tests with […];
and so on. […] measurements in Table V classify and subclassify […] ability, i.e., according to the
difficulty of the […] answer. With a summational method of factorization such gradings can only
[…] of dichotomous classifications. Hence […] hierarchical scheme […] items and persons
according to a somewhat artificial […].

The curious […] pattern which is thus p[…] analogous to that obtained by applying […] of
factorization to a correlation table […] (cf. this Journal, II, p. 112, Table VC). However, as in
that case (see ibid., […]), we


¹ The formula is due to Tchebycheff (1): cf. Isserlis, L., ‘A Note on Tchebycheff’s
Interpolation Formula,’ Biometrika, 1927, pp. 87f., and (5) and […]. […]
reached by Dr. Guttman ((18), pp. 343-4) for expressing the […] (eqn. […]) or
between ‘weights’ (eqn. (46)), which (as he points out) reduce […] simplified case to a difference
equation given by Tchebycheff, are among the most inter[…].

² […] another […] 307), Ferguson has likewise adopted what is […]. Ferguson refers to my own
[…] bipolar factors as classificatory principles, and applies it to the factors obtained in analysing
difficulty. The factors so obtained, he says, “are exemplary […]”: the first measures the average
difficulty of the items; the second (Guttman’s first or metric component) measures “the deviation
of difficulty of the item from the average difficulty”; and similarly for all the rest ((12), p. 326).
This, of course, is not quite the same […] types of divergent classification described […] which
are termed ‘hierarchical’ by the older […]. That requires a series of ‘subdivided factors’; and
these in turn introduce only […] oscillations at each stage. The ordinary summational method
yields cross-classifications as well as subdivisions; and this results in one more oscillation at each
stage. (See this Journal, II, pp. 41-63, […] ‘Subdivided Factors.’)




could, if we preferred, substitute Yule’s triangular mode of factorization : this would show still more 
explicitly that each of the five bipolar factors dealt with five progressive levels of difficulty. I need 
not exemplify this point in detail here, since I have already illustrated it in a previous contribution.¹


IV. FACTOR ANALYSIS OF THE CORRESPONDING 
ANSWER PATTERN 


Tabulation of Positive Categories Only. I now propose to consider the results that would
have been obtained had we taken the experimental data embodied in Dr. Guttman's table 
and treated them according to the more usual routine methods. There will be two differences 
at the very outset. First, with two types of answer only, the actual performances would 
ordinarily be entered in a table specifying only 5 items, not 10 categories. Hence the rank 
of the product-sum matrix will be only 5, and we shall now obtain only 5 factors. Secondly, 
for purposes of correlation, the entries would be converted into deviation form. Since 
we have assumed n = N — 1, this reduction will make the number of degrees of freedom 
for persons the same as that for tests, and the effect of the change will be to weight both 
successful and unsuccessful answers according to the difficulty of the items. The resulting 
mark sheet is shown in Table VIII. 


TABLE VIII. MARK SHEET FOR DR. GUTTMAN'S DATA IN UNITARY
STANDARD MEASURE


                        Persons
Item   |    5     4     3     2     1     0    | Square-Sum
  1    |    5    -1    -1    -1    -1    -1    |   6 × 5
  2    |    4     4    -2    -2    -2    -2    |   6 × 8
  3    |    3     3     3    -3    -3    -3    |   6 × 9
  4    |    2     2     2     2    -4    -4    |   6 × 8
  5    |    1     1     1     1     1    -5    |   6 × 5
Total  |   15     9     3    -3    -9   -15    |   6 × 105
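
The entries of this mark sheet follow directly from the pass-fail pattern. As a minimal sketch (Python, for illustration only), each pass or fail is expressed as a deviation from the item mean and multiplied by the number of persons, 6, so that the figures remain whole numbers.

```python
n_persons = 6
persons = [5, 4, 3, 2, 1, 0]

sheet = {}
for item in range(1, 6):
    # Item k is passed by the k ablest persons (scores 6 - k and above).
    passes = [1 if s >= 6 - item else 0 for s in persons]
    total = sum(passes)
    # Deviation from the item mean, multiplied by 6 to avoid fractions.
    sheet[item] = [n_persons * x - total for x in passes]

square_sums = {item: sum(d * d for d in row) for item, row in sheet.items()}
totals = [sum(sheet[item][i] for item in sheet) for i in range(n_persons)]

for item, row in sheet.items():
    print(item, row, f"6 x {square_sums[item] // n_persons}")
print("Total", totals, f"6 x {sum(t * t for t in totals) // n_persons}")
```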


The 5 x 5 correlation table can be factorized by weighted summation in the usual way. 
The saturations and regressions thus obtained are set out in Tables IXA and IXB. 


Factor-Saturations. The process of converting the observed test-marks to deviations 
about the mean has eliminated the former general factor (G), which was really a factor of 
difficulty for the several test-items ; and the new first factor now represents the general 
ability used in solving all such items. The factor-measurements for this factor therefore 
will measure the general ability of the persons tested. 

The relation between the two sets of saturations will be clear if we compare the figures
in Tables VII and IXA. Let us drop the first column in Table VII, and then restandardize
the remaining figures so that the square-sum is still 1.000. This is equivalent to multiplying
them by the values shown in the second column of Table IX (headed ‘multiplier’); and
yields the results shown in the table. The sign pattern remains the same; and the number
of oscillations in the five successive rows is now 0, 1, 2, 3, and 4 respectively.


Factor-Measurements. To find the ordinary linear regression coefficients, each column
of saturations is, as usual, divided by its square-sum. Then, on applying these coefficients
(Table IXB) to the standardized test-measurements (obtained by dividing the marks in
Table VIII by the square root of the square-sum), we obtain the factor-measurements for
the five factors. It will be discovered that the figures thus computed are precisely the same
as those obtained by the previous procedure (Table V above). They are, therefore, exactly
proportional to Dr. Guttman’s ‘scores’; and differ from his only by being expressed in
unitary standard measure. Hence it may be fairly claimed that, so far as the essential
results are concerned, the figures reached by an ordinary factorial analysis are identical
with those reached by scale analysis.


¹ This Journal, IV, p. 76, Table IIB; cf. also Factors of the Mind, p. 306, Table IIB.


TABLE IX. FACTOR-SATURATIONS AND REGRESSIONS


                  |              A. Saturations                    |               B. Regressions
                  |                  Factor                        |                   Factor
Item | Multiplier |    I       II      III      IV       V        |    I       II      III       IV        V
  1  |  √(6/5)    |  .6547  +.5976  +.4083  +.2073  +.0690        |  .2182  +.5976  +.8166   +.6908   +.3451
  2  |  √(6/4)    |  .8281  +.3780  -.1291  -.3272  -.2182        |  .2760  +.3780  -.2582  -1.0906  -1.0911
  3  |  √(6/3)    |  .8783   .0000  -.3651   .0000  +.3086        |  .2928   .0000  -.7302    .0000  +1.5430
  4  |  √(6/2)    |  .8281  -.3780  -.1291  +.3273  -.2182        |  .2760  -.3780  -.2582  +1.0906  -1.0911
  5  |  √(6/1)    |  .6547  -.5976  +.4083  -.2073  +.0690        |  .2182  -.5976  +.8166   -.6908   +.3451
Square-Sum        | 3.0000  1.0000   .5000   .3000   .2000        |  .3333  1.0000  2.0000   3.3333   5.0000
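
The agreement claimed between the two procedures can be checked on the five-item mark sheet. The sketch below (Python, an illustration only) takes the whole-number scores of Table VI as given in place of a full extraction of latent vectors, which is an assumption of the sketch; it then forms the saturations, divides each column by its square-sum to obtain the regressions, and applies these to the standardized marks.

```python
from math import sqrt

persons = [5, 4, 3, 2, 1, 0]
# Mark sheet in deviation form (Table VIII): 6 * (pass or fail) minus number passing.
deviations = [[6 * (1 if s >= 6 - item else 0) - item for s in persons]
              for item in range(1, 6)]
Z = [[d / sqrt(sum(x * x for x in row)) for d in row] for row in deviations]  # items x persons

polys = [[5, 3, 1, -1, -3, -5], [5, -1, -4, -4, -1, 5], [5, -7, -4, 4, 7, -5],
         [1, -3, 2, 2, -3, 1], [1, -5, 10, -10, 5, -1]]
P = [[x / sqrt(sum(y * y for y in row)) for x in row] for row in polys]       # factors x persons

# A. Saturations: product-sums of each standardized item with each factor.
sat = [[sum(z * p for z, p in zip(Z[i], P[k])) for k in range(5)] for i in range(5)]
roots = [sum(sat[i][k] ** 2 for i in range(5)) for k in range(5)]
# B. Regressions: each column of saturations divided by its square-sum.
reg = [[sat[i][k] / roots[k] for k in range(5)] for i in range(5)]
# Factor-measurements: regressions applied to the standardized marks.
measures = [[sum(reg[i][k] * Z[i][j] for i in range(5)) for j in range(6)]
            for k in range(5)]
gap = max(abs(measures[k][j] - P[k][j]) for k in range(5) for j in range(6))

for i in range(5):
    print(i + 1, " ".join(f"{v:+.4f}" for v in sat[i]), "|",
          " ".join(f"{v:+.4f}" for v in reg[i]))
print([round(r, 4) for r in roots], gap)
```

The factor-measurements so computed reproduce the normalized scores to within rounding, which is the sense in which the two sets of results are proportional.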


In virtue of the symmetry that now characterizes each […], the way in
which both saturations and regressions effect a hierarchical
classification is here a little more obvious. Nevertheless, a set of 5 items and a sample of 6 persons
do not lend themselves very neatly to classification by means of successive dichotomies. In order to
exhibit the general principle in its most typical form we require an example with $n = 2^k$ items. Let
us therefore drop the third item in Dr. Guttman’s set of items. This will leave two persons in the
same category, but may be all the more instructive on that account (see top section of Table X).


ber of positive signs is 
for a battery of n 


Items. On 2 : о 
Correlation EE neni Жер order f plus and minus signs with each 
successive factor. The whole procedure is tantamount to taking as the first set of factor-measurements
the simple averages of the marks; then averaging the residuals; then averaging the residuals from
this second set of averages; and so on. It is in fact the method most commonly adopted for
partitioning a given set of data in the analysis of variance.
To keep th[…] factors equal to that for items, it will be necessary to normalize the
vectors […] left hand panel of the middle section of Table X. On
pre-multiplying the marks for the four remaining tests by this orthogonal matrix of weights we
obtain the factor-measurements shown in the central panel. To recover the initial matrix of marks
from the factor-measurements we have merely to take the transpose of the ‘item weights’ and use
it to pre-multiply the matrix of factor-measurements, as shown in the last panel of all.
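
The averaging procedure just described can be checked numerically. What follows is only a minimal Python sketch and not part of the original paper: it assumes the deviation marks of the four retained items as they stand in the top section of Table X, and it assumes that every item weight is +1/2 or -1/2 with the sign patterns named in the text (general, harder versus easier, extremes versus intermediate, alternating). The helper names `matmul` and `transpose` are illustrative.

```python
# Deviation marks from the top section of Table X.
# Rows are items 1, 2, 4, 5; columns are persons 5, 4, 3, 2, 1, 0.
marks = [
    [5, -1, -1, -1, -1, -1],
    [4,  4, -2, -2, -2, -2],
    [2,  2,  2,  2, -4, -4],
    [1,  1,  1,  1,  1, -5],
]

# Assumed normalized item weights: rows are factors I-IV, columns are items.
h = 0.5
item_weights = [
    [h,  h,  h,  h],   # I   general factor
    [h,  h, -h, -h],   # II  two harder items against the two easier
    [h, -h, -h,  h],   # III extreme items against intermediate ones
    [h, -h,  h, -h],   # IV  alternating signs
]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


# Factor-measurements: weights pre-multiplying the marks.
factor_measurements = matmul(item_weights, marks)
square_sums = [sum(v * v for v in row) for row in factor_measurements]

# Recover the marks by pre-multiplying with the transposed weights.
reconstructed = matmul(transpose(item_weights), factor_measurements)

for name, row, ss in zip(["I", "II", "III", "IV"], factor_measurements, square_sums):
    print(name, row, ss)
```

Because the weight matrix is orthogonal, the total of the factor square-sums equals the total of the item square-sums, and the transposed weights recover the marks exactly.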


. Now let us compare the results thus obtained with t E gnorin 

2807 and the facie which have been suppressed). It will be seen that the ‘item weights’ and 
‘factor weights’ in Table X have the same sign pattern as […]
preceding tables, and that (except for the indeterminate sign of the zeros) the ‘factor-measurements’
also have the same sign pattern as the factor-measurements in Table […]. […]
values furnish a sensible approximation to the ‘best fitting values’ given by the more elaborate
procedure.

ciples of Partition’). Cf. also Burt, (е 

Д The factor me ae 

istically independent of all the others : 

end ? be fulfilled, we 


nor do the figures give the closest possible fit. i 

ould weight eich set of factor-measurements арргориа у before subtract 
caw і і i ethod о i 

уе residuals. For this modified me of Differential Weighting, t 

re obtained by simple sum: 




Scale Analysis and Factor Analysis 


TABLE X. FACTOR ANALYSIS BY SIMPLE AVERAGING

                                       Persons                   Square-
Items                        5    4    3    2    1    0            Sum

  1                          5   -1   -1   -1   -1   -1             30
  2                          4    4   -2   -2   -2   -2             48
  4                          2    2    2    2   -4   -4             48
  5                          1    1    1    1    1   -5             30

       Item Weights                Factor-Measurements

  I    +½   +½   +½   +½     6    3    0    0   -3   -6             90
  II   +½   +½   -½   -½     3    0   -3   -3    0    3             36
  III  +½   -½   -½   +½     0   -3    0    0    3    0             18
  IV   +½   -½   +½   -½     1   -2    1    1   -2    1             12

       Factor Weights          Initial Scores (Reconstructed)
        I   II  III   IV

  1    +½   +½   +½   +½     5   -1   -1   -1   -1   -1             30
  2    +½   +½   -½   -½     4    4   -2   -2   -2   -2             48
  4    +½   -½   -½   +½     2    2    2    2   -4   -4             48
  5    +½   -½   +½   -½     1    1    1    1    1   -5             30


The mode of classifying both items and persons implicitly adopted by summational methods of
factorization will now be clearer. The first of the four factors, the general factor, has positive
values for each of the ‘item weights’; when the values are equal, the effect is virtually to sum the
correct responses; it thus specifies, in Dr. Guttman’s phrase, the ‘metric’ component. The second
factor contrasts performance in the two harder items with performance in the two easier: in a
certain sense, therefore, it is, as he says, an ‘intensity component.’ The third factor contrasts
performance in the extreme items with performance in those of intermediate difficulty: it thus
corresponds with Dr. Guttman’s component of ‘closure.’ If a name were needed for the fourth
factor, I would call it an ‘alternating’ component: Dr. Guttman believes that it ‘portrays a new
psychological concept’ which he terms ‘involution.’

Since the number of oscillations that characterizes each successive factor is primarily an outcome 
of the mathematical technique employed, it would seem better to envisage two ways of labelling 
the factors obtained. (1) First, it would be useful to have purely abstract designations merely indicat- 
ing the formal properties of the several factors. And these designations would presumably differ 
according to our formal interpretation of the procedure. We may, for example, conceive the factors 
either as specifying the means (weighted or unweighted) of the observed measurements and of their 
successive residuals or as specifying the several types of independent comparison. (2) Secondly, we 
need concrete designations giving the material interpretation of the factors in this or that particular 
application ; and the interpretation of course will vary according to the subject matter to which the 
analysis is applied.¹ From the formal standpoint (which alone concerns us here) the whole
procedure, as I have suggested elsewhere, is rather like analysing the sounds of a musical instrument
of complex timbre into a fundamental note with its successive overtones. The ideal test might be
compared to a relatively pure note with little or no distorting noise; an analysis by subdivided factors
is comparable to a Fourier analysis, where each additional harmonic is fainter than its predecessor
and has twice the number of oscillations per second; and the recombination of factor-measurements
by weighted summation is analogous to the compounding of simple harmonic motions at right angles
to each other, as demonstrated (in the two-dimensional case) by Lissajous figures.²


¹ Spearman and Thurstone appear to use the word ‘factor’ exclusively in the latter sense. Elsewhere
I have suggested that the term ‘component’ might be used for the purely mathematical
‘factor,’ and the name ‘factor’ reserved for the concrete interpretation.

² A still closer analogy is presented by the methods adopted for the ‘spectral analysis of aggregates’
in quantum physics; and, in view of the use of complex numbers in quantum theory, it is instructive
to note that some of the problems arising in the factor analysis of frequency-data may also be
solved by the use of components involving complex numbers (cf. Burt, C., ‘The Unit Hierarchy
and its Properties,’ Psychometrika, III, 1938, pp. 151f.; also this Journal, IV, 1951, p. 20). In
observing how the attempt to factorize a pattern of all-or-none responses gives rise to a set of
oscillating components, it is impossible not to be reminded of the way de Broglie and Schrödinger have
attempted to describe electrons in terms of oscillations in multidimensional space.




Conclusions. The foregoing examples will, I hope, sufficiently indicate that, with ideal
tables of the type we have just been considering, a factor analysis can after all “show what
a scale analysis will show,” and that the use of the ordinary product-moment coefficient
may lead to results which are by no means “misleading,” even when the scale is perfect:
a more general proof is outlined in the Appendix. But further it may, I think, also be claimed
that a factorial approach such as I have described is particularly suitable with empirical
data where the test-items do not form a perfect scale. This conclusion, it must be confessed,
is based chiefly on experience in factorizing dichotomous test-items (like those composing
such scales as the Binet-Simon or group-tests of the ordinary type) and qualitative assessments
obtained for physical characteristics and for social conditions.
Now, as I emphasized at the outset, the problems discussed by Dr. Guttman originated
in a rather different field. “The need for scale analysis,” he tells us, “arises out of the
fundamental problem of attitude scale and opinion polling of how to determine the dimensions
of meaning which the questions asked have for the respondents: scale analysis affords a
rigorous test for the existence of single-meaning for an area, and provides a rank order of
individuals for such areas as are found scalable” ((18), p. 88). My own impression is that
each of the various methods we have discussed has its special advantages and its special
limitations, and the chief question awaiting solution is to determine still more precisely the
particular conditions which give the preference to this method or to that. With this aim
in view, it would, for instance, be instructive to study the results reached by applying the
factorial methods described above to data obtained from questionnaires on attitudes or
opinions, and by using Dr. Guttman’s methods of scalogram analysis for data obtained with
ordinary pass-or-fail test-items. Meanwhile, there can be no doubt that his intensive
examination of the problems encountered in attempts to “quantify attributes” has brought
to light many novel and instructive points, and should, in his phrase, go far to “stimulate
rethinking about factor analysis in general.”


V. SUMMARY 


Rm le e purpose ant r 
5 redueibility * pou Dr. Suena. Nevertheless, Dr. Guttman’s more recent and 
mae examination ot the issues eus : 

Tevealed ma d fruitful problems. ) à on 3 

3; Аер O prove the construction and scaling of ШЫ; mel series 
Appear to indicate that for general purposes the most n ke y T 
actorial analysis of the constituent items. It is applicable to е npir қар oe 
Ог perfect, scales. The results indicate a distinction Ren con m eg ШЕ 
actors,’ and demonstrate a reciprocal relation between the dt 


the abilitie М А 
S Of the persons tested: ive factorial technique would appear 
4. For the study of qualitative data the mosi emane due the orthogonal mean square 


to be that origi by Pearson fo i 
Tegression Meee Qu тороо Баш with unreduced test-variances) or, as Pearson 
les 




himself preferred to call it, the method of * principal axes.’ Pearson’s extension of regression 
to the determination of curvilinear regression lines suggests a generalization of factorial 
methods to deal with cases in which the relations are not linear but curvilinear. These 
modified methods are applicable both to measurement-data and to frequency-data. The
use of regression equations of a degree higher than the first is analogous to the use of higher 
moments, and, with qualitative data, leads to the consideration of frequencies of higher 
order than the second. With ideal answer patterns the resulting factor-measurements (the 
latent vectors for persons) are proportional to the values tabulated by Fisher and Yates for 
orthogonal polynomials. It would seem therefore that Dr. Guttman's objections to the use 
of factor analysis with qualitative data may thus be fully met. 

5. A factor analysis of Dr. Guttman’s ideal ‘scale pattern’ (which includes negative
as well as positive categories) yields figures for the factor-measurements which are identical
with Dr. Guttman’s ‘component scores’ when the latter are reduced to normalized form.
The factor-saturations are related to his ‘component weights,’ but here the relation is less
direct.

6. If an answer pattern of the more usual type (i.e., one which includes positive categories
only) is substituted for the ideal ‘scale pattern,’ a factor analysis of the product-moment
correlations still produces precisely the same factor-measurements. These factor-measurements
exhibit a succession of dichotomous classifications, sub-classifications, and cross-classifications
according to a hierarchical scheme, which is described by the sign pattern
of the saturations and measurements. For practical work involving the analysis of empirical
answer patterns certain abridged methods are available.


REFERENCES 


1. Tchebycheff, P. L. (1859). ‘Sur l'interpolation.’ 541-60. St.-Petersbourg. Reprinted in Œuvres (1907), I, 478-98.
2. Pearson, K. (1901). ‘On lines and planes of closest fit to a system of points in space.’ Phil. Mag., II, 6th Ser., 559-72.
3. Yule, G. U. (1910). An Introduction to the Theory of Statistics. London: Griffin.
4. Burt, C. (1921; 2nd ed. 1947). Mental and Scholastic Tests. London: Staples Press.
5. Allan, F. E. (1930). ‘The general form of the orthogonal polynomials for simple series with proofs of their simple properties.’ Proc. Roy. Soc. Edin., L, 310-23.
6. Walker, D. A. (1931, 1936, 1940). ‘Answer pattern and score scatter in tests and examinations.’ Brit. J. Psych., XXII, 73-86, XXVI, 301-8, XXX, 248-60.
7. Greenleaf, H. E. H. (1932). ‘Curve approximation by functions analogous to the Hermite polynomials.’ Ann. Math. Stats., III, 204-17.
8. Aitken, A. C. (1933). ‘On fitting polynomials to weighted data by least squares.’ Proc. Roy. Soc. Edin., LIV, 1-17.
9. Fisher, R. A. (1934). Statistical Methods for Research Workers (5th ed.). Edinburgh: Oliver & Boyd.
10. Burt, C. (1940). The Factors of the Mind. London: University of London Press.
11. Ferguson, G. A. (1941). The Reliability of Mental Tests. London: University of London Press.
12. Ferguson, G. A. (1941). ‘The factorial interpretation of test difficulty.’ Psychometrika, VI, 323-9.
13. Burt, C., and John, E. (1942). ‘A factor analysis of the Terman-Binet tests.’ Brit. J. Educ. Psychol., XII, 117-21, 156-61.
14. Kendall, M. G. (1946). Advanced Theory of Statistics. London: Griffin.
15. Cramér, H. (1946). Mathematical Methods of Statistics. Princeton: University Press.
16. Fisher, R. A., and Yates, F. (1948). Statistical Tables. Edinburgh: Oliver and Boyd.
17. Burt, C. (1950). ‘The factorial analysis of qualitative data.’ Brit. J. Psych., Stat. Sect., III.
18. Stouffer, S. A., et al. (1950). Measurement and Prediction. Princeton: University Press.
19. Burt, C. (1951). ‘Test construction and the scaling of items.’ Brit. J. Psych., Stat. Sect.




APPENDIX 


Let n be the number of items in the scale, and N the number of persons tested. According to
Dr. Guttman, if the scale is perfect, the number of categories of persons between which it can
discriminate will be n + 1. Let us therefore assume N = n + 1. Let j or k (j, k = 1, 2, ..., n) be
used to indicate the code-number assigned to the several items when arranged in order of difficulty,
and a (a = n, n − 1, ..., 1, 0) be the code-number assigned to the persons when arranged in order of
ability: so that n denotes the easiest test and the ablest person. Finally, let $x_{ak}$ stand for the mark x
obtained by the ath person in the kth item.

In a perfect scale, after the marks have been reduced to deviation form, there will be, for the
kth item, k positive marks of (n − k + 1), and (n − k + 1) marks of −k. For each item the mean
will therefore be m = 0; the square-sum ($q_k$ say) will be (n + 1)(n − k + 1)k, and the variance
($\sigma_k^2$) will be (n − k + 1)k. Using the familiar formulae for the sum of the first n numbers and
for the sum of their squares, it is then easy to show that the sum of the variances, $\sum \sigma_k^2$, will be
$\tfrac{1}{6}\,n(n+1)(n+2)$, and that the total mark for the ath person will be
$t_a = \sum_k x_{ak} = \tfrac{1}{2}(n+1)(2a-n)$, and his mean mark therefore $\tfrac{1}{2n}(n+1)(2a-n)$.
The mean marks for the several categories of persons will consequently constitute a perfect linear
scale diminishing by regular steps of $(n+1)/n$, and their variance, $\sigma_m^2$, will be
$(n+1)^2(n+2)/12n$.
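
The closed forms in this paragraph can be checked by brute force. The snippet below is an illustrative Python sketch, not taken from the paper: the function `perfect_scale` is a hypothetical helper that builds the deviation marks exactly as described above (item k passed by the k ablest persons), and the case n = 5 is chosen only because it matches the worked example in the text.

```python
from fractions import Fraction


def perfect_scale(n):
    """Deviation marks for a perfect scale (hypothetical helper).

    Rows are persons a = n, n-1, ..., 0; columns are items k = 1, ..., n.
    Item k is passed by the k ablest persons, who each score (n - k + 1);
    the remaining (n - k + 1) persons each score -k.
    """
    return [[(n - k + 1) if a >= n - k + 1 else -k for k in range(1, n + 1)]
            for a in range(n, -1, -1)]


n = 5
N = n + 1
x = perfect_scale(n)

# Every item has mean zero in deviation form.
item_sums = [sum(row[k] for row in x) for k in range(n)]

# Item variances (divisor N) and their total.
variances = [Fraction(sum(row[k] ** 2 for row in x), N) for k in range(n)]

# Total and mean mark for each person, a = n, ..., 0.
totals = [sum(row) for row in x]
means = [Fraction(t, n) for t in totals]
var_of_means = sum(m * m for m in means) / N

print(variances, sum(variances))
print(totals, var_of_means)
```

Exact fractions are used so that the comparison with the algebraic expressions involves no rounding.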

Let us now form a product-sum matrix by pre-multiplying […] of items. The result will be, not
a sy[…], but what I have called an isoclinal¹ matrix. If we add each column of this matrix,
and call the total for the kth item $\sum_j p_{jk}$, we shall have²

$$\sum_j p_{jk} = -(n+1)\sum_{j=0}^{k-1} t_j = \tfrac{1}{2}(n+1)^2\,k(n-k+1) = \tfrac{1}{2}(n+1)\,q_k = \tfrac{1}{2}(n+1)^2\sigma_k^2 .$$

For the correlations we have, by the usual product-moment formula,
$r_{jk} = p_{jk} / \{(n+1)\,\sigma_j\sigma_k\}$.

If we proceed to factorize the matrix of correlations by the method of weighted summation, then
(as may readily be shown) the saturations for the first factor will be

$$f_k = \frac{(n+1)\,\sigma_k}{2n\,\sigma_m}. \qquad \text{(i)}$$

We can verify this by applying the basic formula $f'R = v_1 f'$ (where $v_1$ denotes the factor-variance
and $f'$ the row-vector of factor-saturations used to weight the columns of the matrix). The
factor-variance, $v_1$, will be given by the sum of the squares of the saturations,

$$v_1 = \sum_k f_k^2 = \frac{(n+1)^2 \sum \sigma_k^2}{4n^2\sigma_m^2}.$$

Substituting the values already reached for $\sigma_k$ and $\sigma_m$, this expression reduces to
$\tfrac{1}{2}(n+1)$. Now, when we premultiply the jth element in the kth column of R by the appropriate
saturation, we obtain $f_j r_{jk} = p_{jk} / (2n\,\sigma_m\sigma_k)$. Summing all such values in the kth
column, we have

$$\sum_j f_j r_{jk} = \frac{\sum_j p_{jk}}{2n\,\sigma_m\sigma_k} = \frac{(n+1)^2\sigma_k^2}{4n\,\sigma_m\sigma_k} = \tfrac{1}{2}(n+1)\cdot\frac{(n+1)\,\sigma_k}{2n\,\sigma_m} = v_1 f_k . \qquad \text{(ii)}$$

Hence the values calculated for the saturations by eqn. (i) conform to the requirements of the
saturations for the principal axes as stated by Pearson.


See thi ; i ; are numbered 
3 dh un a nih fast e Dr. Guttman’s notation, the persons 
backwards, ign is due to , y 




The regressions ($w_k$), as usual, are obtained by dividing the corresponding saturations by $v_1$.
We thus have $w_k = \sigma_k / n\sigma_m$. The factor-measurement for the ath person will accordingly be
obtained by taking a weighted sum of his standardized marks, viz.,

$$\sum_k w_k\,\frac{x_{ak}}{\sigma_k} = \frac{1}{n\,\sigma_m}\sum_k x_{ak} = \frac{(n+1)(2a-n)}{2n\,\sigma_m} = \sqrt{\frac{3}{n(n+2)}}\;(2a-n). \qquad \text{(iii)}$$

The sum of the squares of (2a − n) for a = 0, 1, 2, ..., n will be
$4\sum a^2 - 4n\sum a + (n+1)n^2 = \tfrac{1}{3}\,n(n+1)(n+2)$. Hence the effect of the expression under
the radical is simply to reduce the values of (2a − n) to standard measure; and these values will
evidently form a perfect arithmetical progression decreasing by a constant difference of
$2\sqrt{3/\{n(n+2)\}}$.

It follows that a factorial analysis applied to the standardized marks is fully capable of
demonstrating whether or not the marks obtained by the several items form a perfect scale in
Dr. Guttman’s sense. To measure the degree to which any empirical set of marks approximates
to a perfect scale, we could, if we wished, correlate the factor-measurements with the ranks of the
several persons (or categories of persons). If we are not interested in determining the rest of the
factors, it would for most purposes be sufficient to correlate the unweighted totals, without performing
a complete factorial analysis to obtain the most appropriate least-squares weights.
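
As a numerical illustration of the argument for the first factor, the following Python sketch (an addition, not the author's own computation) builds the perfect scale for n = 5 and checks three things. It assumes that eqn. (i) has the form f_k = (n + 1)σ_k / (2nσ_m) and that eqn. (iii) gives factor-measurements equal to √{3/(n(n + 2))}·(2a − n); both readings are assumptions about the printed formulae. The helper `perfect_scale` is hypothetical.

```python
import math


def perfect_scale(n):
    """Deviation marks: rows are persons a = n..0, columns are items k = 1..n."""
    return [[(n - k + 1) if a >= n - k + 1 else -k for k in range(1, n + 1)]
            for a in range(n, -1, -1)]


n = 5
N = n + 1
x = perfect_scale(n)

sigma = [math.sqrt((n - k + 1) * k) for k in range(1, n + 1)]   # item s.d.s
sigma_m = math.sqrt((n + 1) ** 2 * (n + 2) / (12 * n))          # s.d. of mean marks

# Product-moment correlations r_jk = p_jk / ((n + 1) sigma_j sigma_k).
R = [[sum(row[j] * row[k] for row in x) / (N * sigma[j] * sigma[k])
      for k in range(n)] for j in range(n)]

# Assumed form of eqn. (i) for the first-factor saturations.
f = [(n + 1) * s / (2 * n * sigma_m) for s in sigma]
v1 = sum(fk * fk for fk in f)                                   # factor-variance

# Left-hand side of f'R = v1 f'.
fR = [sum(f[j] * R[j][k] for j in range(n)) for k in range(n)]

# Regressions and factor-measurements for persons a = n, ..., 0.
w = [fk / v1 for fk in f]
scores = [sum(w[k] * row[k] / sigma[k] for k in range(n)) for row in x]

print([round(fk, 4) for fk in f], round(v1, 4))
print([round(s, 4) for s in scores])
```

For n = 5 the saturations agree with the first column of Table IXA, and the scores form an arithmetical progression with unit variance.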

Algebraic expressions for the remaining factors can be reached along similar lines. The order
of the curve required to give a perfect fit is the same as the number of degrees of freedom available
for the differences between the observed values, namely, N − 1 = n. This will therefore be the
order of the polynomial expression for the nth bipolar factor, i.e., for the last.

Let us reduce the rank order for the persons to deviation form, and write $x_{0i} = 1$ (i = 1, 2, ..., N)
for the general factor-measurements (which are the same for all persons), and $x_{1i} = a_i - \tfrac{1}{2}(N-1)$
for the first factor-measurements. Then the formulae for all the factor-measurements can be
expressed algebraically as follows.* I add in each case the numerical equations so given for the data
examined in the preceding pages, where N = 6 (cf. Table V).

I.   $p_1 = \frac{1}{c_1}\,[\,x_1\,]$

II.  $p_2 = \frac{1}{c_2}\,[\,-\tfrac{1}{12}(N^2-1) + x_1^2\,] = \frac{1}{c_2}\,[\,-\tfrac{35}{12} + x_1^2\,]$

III. $p_3 = \frac{1}{c_3}\,[\,-\tfrac{1}{20}(3N^2-7)\,x_1 + x_1^3\,] = \frac{1}{c_3}\,[\,-\tfrac{101}{20}\,x_1 + x_1^3\,]$

IV.  $p_4 = \frac{1}{c_4}\,[\,\tfrac{3}{560}(N^2-1)(N^2-9) - \tfrac{1}{14}(3N^2-13)\,x_1^2 + x_1^4\,] = \frac{1}{c_4}\,[\,\tfrac{81}{16} - \tfrac{95}{14}\,x_1^2 + x_1^4\,]$

V.   $p_5 = \frac{1}{c_5}\,[\,\tfrac{1}{1008}(15N^4 - 230N^2 + 407)\,x_1 - \tfrac{5}{18}(N^2-7)\,x_1^3 + x_1^5\,] = \frac{1}{c_5}\,[\,\tfrac{11567}{1008}\,x_1 - \tfrac{145}{18}\,x_1^3 + x_1^5\,]$

The values for $c_r = \sqrt{\sum_a x_{ra}^2}$ (a = 1, 2, ..., N), the constant used to reduce the values to unitary
standard measure, can be obtained by the following equation:

$$c_r^2 = \sum_a x_{ra}^2 = \frac{(r!)^4}{(2r)!\,(2r+1)!}\; N(N^2-1)(N^2-2^2)\cdots(N^2-r^2). \qquad \text{(iv)}$$


* The unstandardized values, denoted by x, and given within the brackets, are proportional to,
but not identical with, those given by Dr. Guttman, which give proportionate values in terms of the
smallest integral numbers, like those tabulated by Fisher and Yates ((16), pp. 70f.). To obtain their
tabulated figures, the above values for $x_r$ must be multiplied by the fraction at the foot of their tables.

It will be seen that, although the five sets of factor-measurements are uncorrelated, they can
nevertheless all be expressed as polynomial functions of one factor only, namely, the first. The order of the
polynomial increases step by step; the even factors contain only even powers, while the odd contain
only odd powers (hence the alternation between symmetry and reversal in the signs of the
successive rows of Table V). For practical work the investigator may either take the values from the
tables of orthogonal polynomials, or, if these are not available, build them up by means of the
following relation¹:


$$x_{r+1} = x_1 x_r - b\,x_{r-1}, \qquad \text{where } b = \frac{\sum x_r^2}{\sum x_{r-1}^2}.$$
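
The recurrence lends itself to a short computation. The sketch below is an illustration in Python under stated assumptions, not a reproduction of the author's working: it takes the relation to be x_{r+1} = x_1·x_r − b·x_{r−1} with b equal to the ratio of the square-sums of x_r and x_{r−1}, and it takes the square-sum formula to be the usual one for orthogonal polynomials on N equally spaced points. The function names are hypothetical, and the points are listed in ascending order a = 0, ..., N − 1, the reverse of the order used for persons in the text.

```python
from fractions import Fraction
from functools import reduce
from math import factorial, gcd


def orthogonal_polynomials(N, max_degree):
    """Unstandardized values x_0, ..., x_max_degree at a = 0, 1, ..., N - 1."""
    x0 = [Fraction(1)] * N
    x1 = [Fraction(2 * a - (N - 1), 2) for a in range(N)]      # a - (N - 1)/2
    polys = [x0, x1]
    for r in range(1, max_degree):
        # b is the ratio of successive square-sums (assumed reading of the text).
        b = sum(v * v for v in polys[r]) / sum(v * v for v in polys[r - 1])
        polys.append([x1[i] * polys[r][i] - b * polys[r - 1][i] for i in range(N)])
    return polys


def smallest_integers(values):
    """Rescale a list of fractions to proportionate smallest whole numbers."""
    lcm = reduce(lambda p, q: p * q // gcd(p, q), [v.denominator for v in values])
    ints = [int(v * lcm) for v in values]
    g = reduce(gcd, [abs(i) for i in ints])
    return [i // g for i in ints]


def square_sum_formula(N, r):
    """Closed form assumed for the square-sum of the r-th polynomial."""
    product = Fraction(N)
    for s in range(1, r + 1):
        product *= N * N - s * s
    return Fraction(factorial(r) ** 4, factorial(2 * r) * factorial(2 * r + 1)) * product


polys = orthogonal_polynomials(6, 5)
for r in range(1, 6):
    print(r, smallest_integers(polys[r]), sum(v * v for v in polys[r]))
```

For N = 6 the rescaled rows are the familiar integer values for six equally spaced points, and the printed square-sums agree with the closed form.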


For any factor the saturation for the jth item ($f_j$) can be obtained from the factor-measurements
($p_a$) by a process of cumulative summation, using the equation
$f_j = \frac{\sqrt{n+1}}{\sigma_j}\sum p_a$, the summation extending to the first j measurements: e.g., for
factor I, taking the figures from line 2 of Table V,
$f_2 = \sqrt{3/4} \times (.598 + .359) = .866 \times .957 = .8281$, as given in line 2 of Table IXA.


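The factor-variances are given by the equations:

$$v_1 = \frac{n+1}{1\cdot 2}, \quad v_2 = \frac{n+1}{2\cdot 3}, \quad \ldots, \quad v_n = \frac{n+1}{n(n+1)}.$$

Their sum is

$$\sum v_i = (n+1)\left\{\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{n(n+1)}\right\}
= (n+1)\left\{\left(1-\tfrac{1}{2}\right) + \left(\tfrac{1}{2}-\tfrac{1}{3}\right) + \cdots + \left(\tfrac{1}{n}-\tfrac{1}{n+1}\right)\right\}
= (n+1)\left(1 - \frac{1}{n+1}\right) = n.$$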
¹ The relation is due to Tchebycheff (1). Cf. also (5) and Isserlis, L., ‘A Note on Tchebycheff’s
Interpolation Formula,’ Biometrika, XIX, 1927, pp. 87f.
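
Both the factor-variances and the statement that the latent vectors for persons are proportional to the tabulated orthogonal polynomials can be verified exactly for n = 5. The Python sketch below is an added illustration: `perfect_scale` is a hypothetical helper, the integer rows are the usual tabulated values for six equally spaced points (entered here by hand, in the person order a = 5, ..., 0), and the check is simply that each row is reproduced, up to the factor v_i, when multiplied by the person-by-person matrix of standardized product-sums divided by N.

```python
from fractions import Fraction


def perfect_scale(n):
    """Deviation marks: rows are persons a = n..0, columns are items k = 1..n."""
    return [[(n - k + 1) if a >= n - k + 1 else -k for k in range(1, n + 1)]
            for a in range(n, -1, -1)]


n = 5
N = n + 1
x = perfect_scale(n)
item_var = [(n - k + 1) * k for k in range(1, n + 1)]

# Person-by-person matrix Z Z' / N, where Z holds the standardized marks.
G = [[sum(Fraction(ra[k] * rb[k], item_var[k]) for k in range(n)) / N for rb in x]
     for ra in x]

# Tabulated integer polynomial values for N = 6, degrees 1 to 5.
tables = [
    [5, 3, 1, -1, -3, -5],
    [5, -1, -4, -4, -1, 5],
    [5, -7, -4, 4, 7, -5],
    [1, -3, 2, 2, -3, 1],
    [1, -5, 10, -10, 5, -1],
]

# Factor-variances v_i = (n + 1) / (i (i + 1)).
factor_variances = [Fraction(n + 1, i * (i + 1)) for i in range(1, n + 1)]


def times(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# A row p is a latent vector with root v when G p = v p.
is_latent = [times(G, p) == [v * e for e in p]
             for p, v in zip(tables, factor_variances)]

print([str(v) for v in factor_variances], sum(factor_variances))
print(is_latent)
```

The five variances are 3, 1, 1/2, 3/10 and 1/5, which are the square-sums of the saturations in Table IX, and their total is n = 5.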




Vol. VI The British Journal of Statistical Psychology Ма 
Part I 1957 


COVARIANCE ANALYSIS 
AND ITS APPLICATIONS IN 
PSYCHOLOGICAL RESEARCH 


By NEIL GOURLAY 
Department of Education, University of Birmingham 


I. Problem. II. Main Uses of Covariance Analysis. III. Comparison of the
Precision of Covariance with that of other Procedures. IV. Summary and Conclusions.


I. PROBLEM 


Fisher’s analysis of variance is now a commonplace in the field of educational
and psychological research: its principles and methods of application are expounded
in all modern statistical text-books. But, when we turn to its correlate, the analysis
of covariance, it is probably true to say that very few text-books provide an adequate
treatment of the subject, although some give an excellent account of a particular
aspect, and that none gives a complete and critical discussion of its uses in psychology.
The wide variation in approach (compare, for example, Lindquist (1), Snedecor
(2), McNemar (3), and Kendall (4)) is itself a source of confusion to all but systematic
students of statistical method.
This paper therefore attempts to review the main uses of the analysis of covariance
with special reference to its applications in psychological research. Its primary
object will be to show that, although the statistical analysis remains more or less
the same in all cases, the interpretations of the results must vary greatly with the
conditions under which the data were obtained. Further, in indicating the correct
interpretations for different experimental situations, the opportunity will be taken
of pointing out some of the wrong interpretations made in applying the technique.
A comparison will also be made of the precision of covariance with that of other
methods; and it will be shown that there are frequently […]
technique can be employed with little loss in precision and […]
in the amount of computational labour. Bartlett (5)* has published an excellent but
brief paper on covariance analysis. But this appeared in a journal which is not
readily accessible to the average educational and psychological […]
was written mainly for the agricultural experimenter; moreover, the misinterpretations
of which Bartlett speaks are still to be found.


II. MAIN USES OF COVARIANCE ANALYSIS 


sis of covariance can 
which any particular 
mber of groups or 
samples from the 
e variates, whether or not 


, 


every hypothesis put for- 


is of covariance had taken shape 
PM bted to him for certain 


Т am also indebted, c 
script and take this opportunity of 


E ys be regarded as a means 0 1 
Tite varies with respect to other variat 
E Pulations. Covariance 15 

me or different populations are measure 

€ measurements are made at the same tme. 
Ж 

I have t i ideas on the an 

о acknowledge that, while most of my i J 

ов Tread Bartlett's paper, their final form owes much iE 
comments he has been good enough to m е on my 

Pressing my thanks. 



However, 




ward for testing requires that some set of statistical conditions shall be satisfied 
before the technique can be validly applied. These differ with the different types of 


hypothesis, and are not always clearly defined in the text-books. They will be 


stated as explicitly as possible in the course of the following discussion. 


In this section the various applications will be considered with reference to three types
of case:

(a) where covariance is used to increase the precision of an experiment;

(b) where it is a means of analysing the relation between variates which have been
measured after treatment of experimental groups;

(c) where it is applied to data obtained from non-random groups.

For simplicity, the discussion will be confined to the case of two variates only, i.e., we
shall consider the variation of one variate, the dependent, relative only to one other, the
independent, variate. This is by far the commonest situation in educational and psychological
research; and a similar argument holds when more than one independent variate
is involved.


(a) The Analysis of Covariance as a Means of Increasing the Precision of an Experiment.
This is probably the most important function that the analysis of covariance can fulfil in
educational and psychological research. It is the only application considered by Fisher and
Lindquist in their text-books ((6), (7), and (1)): and it is one of the two discussed by Edwards (8).
On the other hand, it receives scant treatment from Snedecor (2), and is not treated at all
in many other text-books, e.g., Johnson’s (9). Johnson admittedly, in one of his examples,
speaks of covariance as “increasing the precision of the experimental comparisons”; but,
as I shall point out later, the phrase is not legitimate in such a context.

When one speaks of increasing the precision of an experiment by the use of covariance 
analysis, it is evidently implied that this increase is relative to the precision obtained by other 
methods of analysis. Here we shall compare covariance with the simple analysis of variance 
only. For ease of exposition, let us make the comparison in terms of a simple experiment 
designed to compare the differential effects of, say, two or more methods of teaching a topic 
in some school subject. For both techniques the experimental procedure is the same except 
in one respect : in particular it is essential that the experimental groups shall form random 
samples from the given population. But, when using covariance, besides applying an
appropriate criterion as a test at the conclusion of the experiment, it is also necessary that the 
groups shall be tested prior to the experiment and that the test adopted shall correlate well 
with the criterion. Often the same test is used, thus ensuring the maximum correlation. In 
the covariance analysis the scores in the initial and final testing become the values for the 
independent and dependent variate respectively. The two variates may be conveniently 
referred to as the initial and final variates. 

The null hypothesis is the same for both analyses : otherwise it would be meaningless to 
talk of greater precision in the case of covariance. A simple statement of the null hypothesis 
is that no significant difference exists between the methods of teaching. Alternatively, we 
may say that the scores for the experimental groups in the criterion-test are random samples 
from the same population ; or, less generally, that the population means for the two groups 
working under different teaching methods are equal, i.e., $\mu_{py} = \mu_{qy}$, where $\mu_{py}$ and $\mu_{qy}$ are
the population means for the pth and qth methods groups in the criterion-test (y).

An analysis of variance tests the hypothesis by providing two independent estimates of the
common population variance: the ‘between methods’ and ‘within methods’ variances based on
(S − 1) and (N − S) degrees of freedom respectively, where S is the number of methods and N is
the total number of subjects employed in the experiment. The variances are tested for heterogeneity
by means of Snedecor’s F-test or Fisher’s z-test. Necessary assumptions in making the test are
that the scores ‘within methods’ are normally (or approximately normally) distributed and that the
methods do not give rise to significant differences in the variances for the different experimental
groups (the condition of homogeneity of variance).*
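
The two variance estimates and their ratio can be sketched in a few lines of Python. This is an illustration only: the function name `one_way_anova` and the six criterion scores are hypothetical, invented purely to show the arithmetic, and no significance table is consulted.

```python
def one_way_anova(groups):
    """Between- and within-methods variance estimates for criterion scores.

    groups: one list of final (criterion) scores per teaching method.
    """
    S = len(groups)                              # number of methods
    N = sum(len(g) for g in groups)              # total number of subjects
    grand_mean = sum(sum(g) for g in groups) / N
    group_means = [sum(g) / len(g) for g in groups]

    between_ss = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    within_ss = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between, df_within = S - 1, N - S
    f_ratio = (between_ss / df_between) / (within_ss / df_within)
    return {"between_ss": between_ss, "within_ss": within_ss,
            "df": (df_between, df_within), "F": f_ratio}


# Hypothetical scores for two methods, three pupils each.
result = one_way_anova([[2, 3, 5], [5, 6, 9]])
print(result)
```

The F ratio would then be referred to the F distribution with (S − 1) and (N − S) degrees of freedom.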


* Although in this paper I have some criticism to make of Johnson’s treatment of the analysis of
covariance (9), his treatment of the analysis of variance is to be recommended in that he
tests for homogeneity of variance, a precaution that many investigators often overlook. But, rather
inconsistently, he does not consider it necessary to test the underlying assumptions required for an
analysis of covariance.




An analysis of covariance, on the other hand, utilizing the fact that the differences between 
final scores are to some extent a reflection of differences between the initial scores, removes the 
variance due to these initial differences from the final variances. It thus reduces the size of the error 
variance, and so increases the precision of the experiment. The algebraic details are given in the 
text-books (e.g., (1), pp. 191-3). Briefly the steps are : 


1. An analysis of the sums of squares for the initial and final scores (i.e., the independent and 
the dependent variate respectively). 

2. An analysis of the sum of products for the two variates. 

3. Calculation of the residual sum of squares for the dependent variate (a) when the total
sums of squares and products are used to fit a regression line to the total data and (b) when
the within methods sums of squares and products are used to fit a regression line of common
slope to each of the methods groups.

4. The subtraction of the residual sum of squares for within methods from the total residual 
sum of squares to give the between methods residual sum of squares. 


One degree of freedom is lost in fitting the regression line; and the results of the analysis can
therefore be tabulated as follows:

------------------------------------------------
Residual sums of squares            d.f.
------------------------------------------------
Between methods                     S − 1
Within methods                      N − S − 1
Total                               N − 2
------------------------------------------------
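
The four steps and the table of residual sums of squares can be followed through in code. The Python sketch below is an added illustration, not a procedure taken from any of the text-books cited: the function name `covariance_analysis` is hypothetical, as are the six pairs of initial and final scores, which were chosen only so that the arithmetic is easy to follow by hand.

```python
def covariance_analysis(groups):
    """Residual sums of squares for a one-way analysis of covariance.

    groups: one (initial_scores, final_scores) pair per teaching method.
    """
    S = len(groups)
    N = sum(len(x) for x, _ in groups)
    all_x = [v for x, _ in groups for v in x]
    all_y = [v for _, y in groups for v in y]
    mx, my = sum(all_x) / N, sum(all_y) / N

    # Steps 1 and 2: total sums of squares and products.
    t_xx = sum((a - mx) ** 2 for a in all_x)
    t_yy = sum((b - my) ** 2 for b in all_y)
    t_xy = sum((a - mx) * (b - my) for a, b in zip(all_x, all_y))

    # Steps 1 and 2: within-methods sums of squares and products.
    w_xx = w_yy = w_xy = 0.0
    for x, y in groups:
        gx, gy = sum(x) / len(x), sum(y) / len(y)
        w_xx += sum((a - gx) ** 2 for a in x)
        w_yy += sum((b - gy) ** 2 for b in y)
        w_xy += sum((a - gx) * (b - gy) for a, b in zip(x, y))

    # Step 3: residuals about (a) the total line, (b) lines of common slope.
    total_resid = t_yy - t_xy ** 2 / t_xx
    within_resid = w_yy - w_xy ** 2 / w_xx

    # Step 4: between-methods residual by subtraction.
    between_resid = total_resid - within_resid

    df_between, df_within = S - 1, N - S - 1
    f_ratio = (between_resid / df_between) / (within_resid / df_within)
    return {"total": total_resid, "within": within_resid,
            "between": between_resid, "df": (df_between, df_within),
            "F": f_ratio}


# Hypothetical data: (initial scores, final scores) for two methods.
data = [([1, 2, 3], [2, 3, 5]), ([2, 3, 4], [5, 6, 9])]
result = covariance_analysis(data)
print(result)
```

The two degrees of freedom add to N − 2, as in the table, and the F ratio is the quantity referred to the F-test.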


On the null hypothesis the residual sums of squares between methods and within methods provide
independent estimates of the same variance, namely, the residual variance in the parent population.
Hence the hypothesis is still tested by the F-test. The underlying assumptions are clearly stated by
Lindquist ((1), pp. 195, 196):

(i) The methods groups must be random samples from the same population;
(ii) The regression of final on initial scores must be linear;
(iii) The regression coefficients must not differ significantly from group to group;
(iv) Within any group the dependent variate must be distributed normally for any given value
of the independent variate, and have a constant variance, the same for all groups, namely,
the residual variance already mentioned.


For each of these assumptions tests are available; and, strictly speaking, they should form
part of any analysis of covariance. The test for condition (iii) is given by Kendall (4), Snedecor (2),
and Jackson [...]; the test for homogeneity of residual variance is similar to that used in the analysis
of variance: [...] in covariance analysis will be found in Jackson ((10), pp. 15-7). No assumption
need be made about the distribution of the independent variate [...]. Consequently the condition of
normality is not necessary (condition (iv) applies only to the dependent variate).

What is the actual gain in precision achieved by covariance [...] a simple analysis of variance?
Since precision is measured by [...] ((7), p. 182), the question can be answered by comparing [...].
The ratio of the two precisions proves to be very nearly $1 - r^2$, where $r$ is the product-moment
correlation within groups. This follows from the well-known [...] that the residual sum of squares
within groups for the dependent variate is $S_y^2(1 - r^2)$, where $S_y^2$ is the sum of squares within
groups for that variate. [...] the residual sum of squares is based on [...] d.f. as compared with
$(N - p)$ d.f. for the sum of squares within groups, it will be seen that the exact value of the [...].

[...] where one or more [...] not randomized, the [...] variance may be an interaction term [...] holds
except that $r$ is now derived from the sums of squares and products for the interaction term. [...]
however that [...] on a small number of [...] freedom, the [...] of covariance may considerably
reduce the gain in precision.
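
The origin of the factor 1 − r² can be set out in two lines. This is a sketch under assumed notation (p groups, N observations in all, within-groups sum of squares S_y², and the further adjustment of the treatment means ignored); it is not a formula quoted from the paper.

```latex
\begin{align*}
\text{error variance (analysis of variance)} &= \frac{S_y^2}{N-p}, &
\text{error variance (covariance)} &= \frac{S_y^2(1-r^2)}{N-p-1},\\[4pt]
\frac{\text{precision of analysis of variance}}{\text{precision of covariance}}
 &= \frac{S_y^2(1-r^2)/(N-p-1)}{S_y^2/(N-p)}
 &&= (1-r^2)\,\frac{N-p}{N-p-1} \;\approx\; 1-r^2 .
\end{align*}
```

On this reading the correction factor (N − p)/(N − p − 1) matters only when the error term rests on few degrees of freedom.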


Covariance Analysis and its Applications in Psychological Research 


(b) The Use of Covariance to Study the Relation between Variates Measured after 
Treatment of Experimental Groups. This is the application with which Snedecor (2) is 
mainly concerned ; it is briefly considered by Edwards (8). 

Here we are again dealing with an experimental situation, i.e., one where principles of 
randomization, statistical control, etc., are deliberately introduced. But now both variates 
are measured after the application of the experimental treatments : that is, both may be 
affected by the treatments, and thus show significant differences between groups. 


Since both are now 'final' variates, it would appear arbitrary which we choose as the independent
variate; but in practice the choice is determined by the question which has prompted the
analysis. Four possibilities arise:¹


(i) We may know on a priori grounds that there are no differences between treatments for the 
independent variate ; 
(ii) Significant differences between groups are found for the independent variate;²
(iii) We may know a priori that there are real differences between treatments for the independent 
variate, although a test of group differences in this variate does not prove significant ; 
(iv) We may have no a priori knowledge about the differences between treatments for the 
independent variate ; but, on testing them, the differences are found to be non-significant. 


In case (i) the situation is the same as that discussed in subsection (a): and in applying the 
analysis of covariance we are (as before) obtaining a more precise test. To illustrate from the 
educational field, one could carry out an experiment on method, e.g., in the teaching of Arithmetic, 
and at the end of the experiment give tests in both Arithmetic and Intelligence to the groups. Then, 
if one could say a priori that there was no difference between the groups in intelligence (and the data 
supported this), the scores for the Intelligence test could be used to improve precision in testing the 
differences between groups in the Arithmetic test. This is not the best procedure. As already stated, 
maximum precision is obtained by testing the groups initially with a test which has a high correlation 
with the final test. However, a situation might arise in which an investigator through ignorance 
failed to set an initial test, and a statistician called in later tried to remedy the position by setting a 
test at the end which correlated well with the final test, and for which there was known to be no 
difference between groups. 

Turning to case (ii), namely, that in which there are significant differences between groups in 
the independent variate, we shall assume that there are also significant group differences in the 
dependent variate : otherwise there would be little point in using covariance. That is the situation 
always assumed in text-books concerned with this type of application. Since in this case the statis- 
tical details are straightforward and trouble arises only when less technical descriptions are attempted, 
it will be advantageous to consider the former first. The algebraic procedure is the same as for 
subsection (a); and the same variances are involved in the F-test. The necessary assumptions 
(p. 27) are also the same. But the hypothesis tested is different. 

We can say that what we are testing is the hypothesis that the regression line (for the dependent 
variate on the independent variate) is the same line (though not necessarily of the same gradient) 
for all treatments ; or that for a given value of the independent variate the estimated value of the 
dependent variate is the same for all treatments ; or again that the means of the bivariate populations 
corresponding to the different groups are collinear, the line of collinearity being the common regression 
line. Algebraically, if $\mu_{px}$, $\mu_{py}$ and $\mu_{qx}$, $\mu_{qy}$ are the population means for any two treatments p
and q (x and y representing the independent and dependent variates respectively), the hypothesis
is that $(\mu_{py} - \mu_{qy})/(\mu_{px} - \mu_{qx}) = \beta$, where $\beta$ is the coefficient of regression in the common population.

It must be emphasized that this is not the same as the hypothesis involved in subsection (a),
where $\mu_{px} = \mu_{qx}$ by virtue of the design of the experiment, and the hypothesis to be tested is simply
$\mu_{py} = \mu_{qy}$, the same hypothesis as is tested by the simple analysis of variance. It follows from
the definition of the term 'precision' that this application of covariance cannot be regarded as a case
in which the purpose is to increase precision. If the expression 'increase of precision' is to have
any meaning, the hypothesis tested must be the same when the precision technique is used as when
it is not.³


¹ As in subsection (a), it will be convenient to speak in terms of a simple experiment on methods
of treatment.

² If an analysis of variance is used to test differences between groups in the independent variate,
the minimum assumptions for an analysis of covariance (p. 27) must be supplemented by the assumption
of normality (or approximate normality) within groups for the independent variate. Both
variates must then be normally distributed.

³ Mathematically the hypothesis of subsection (a) may be considered as a limiting case of the other.
I therefore disagree with Snedecor ((2), p. 323) when he regards 'increase of precision' as one of the
features of this application. The remarkable thing is that later (pp. 334, 335) he appears to be quite
clear about the distinction.


М. GOURLAY 


Let us now attempt a less technical description of the hypothesis stated above. Still
keeping to statistical terms, we can say that we are seeking to discover whether the relation
which the dependent variate bears to the independent variate is the same between groups
as within groups; more loosely, whether the 'between group' differences in the dependent
variate are 'adequately accounted for' (as the text-books sometimes put it) by the 'between
group' differences in the independent variate. I give two examples.


(1) Two groups of young children, backward mentally and educationally, are assigned at random 
to two different kinds of environment. Some years later they are tested for intelligence and educational 
attainment. To determine whether a difference between the groups in attainment is accounted for 
by their difference in intelligence, an analysis of covariance might be suitably used. 

(2) In an experiment to compare methods of teaching composition, the experimental groups 
may be tested at the end of the experiment for ability in composition and in the mechanical aspects 
of English. Once again we might use covariance to discover whether group differences in the second
variate 'explain' the group differences in the first.

There is yet another popular way of expressing the purpose in this case. We can speak of
testing for differences between groups in the dependent variate after the latter have been 'corrected'
for differences between groups in the independent variate, or, as Snedecor put it ((2), p. 320), after
'due allowance is made for' such differences. So long as such expressions are understood to mean
no more than is contained in the previous statistical statements, they are permissible. There is one
interpretation which must not be made. It must not be thought that, when covariance is applied
to make 'due allowance for' the independent variate, we are then testing the same hypothesis as
would be involved in an experiment where it was arranged experimentally that there was no difference
between groups in that variate.¹
Case (iii) is easily dismissed. The interpretation of an analysis of covariance for this situation
is the same as for case (ii).

Case (iv) might at first sight be put with case (i). But, if one considers the case of
normally distributed groups with no significant differences for either variate but a significant overall
difference when a A-test is applied,³ it will be seen at once that the identification would be wrong.
The significant overall difference must be due to real differences between groups in either the one
variate or the other or in both; and the data do not allow us to accept one of these alternatives and
reject the others. It follows that we cannot speak of using the one variate as the independent variate
of a covariance analysis in order to increase the precision in testing differences [...] other
variate. Hence case (iv) cannot be identified with case (i), but should rather be interpreted along the
lines adopted for case (ii).


(c) The Analysis of Covariance Applied to Non-Random Groups. In the previous
applications the groups were random samples from the same population. In a third type
of application they may be non-random or (to use [...]) [...]. Examples
from educational enquiries are the pupils in a school or grade or [...]. [...]
be measured at the same time or at different times (cf. Kendall (4), Johnson (9), McNemar
[...], and Jackson [...]).

As before, we will consider the purely statistical aspect first. The algebraic procedure
[...] between the three types. Further, the [...] except that the condition of randomness no [...]
than in the previous cases. The researches quoted by Lindquist ((1), pp. [...]) suggest that, in
dealing with intact school groups, heterogeneity [...] the rule [...]; the same is probably true of
regression. [...] homogeneity of regression [...].

What statistical hypothesis is involved in this type [...]? From subsections (a) and (b),
the reader will see that the valid hypothesis is that already stated for subsection (b),
case (ii). [...] there should be no difficulty in examining the validity of
less technical interpretations. Consider first whether covariance can be used to compare
treatments, methods, etc., when applied to non-random groups instead of random samples.

¹ [...] illustration of this point is a simple experiment on methods where the groups do not
receive the same total amount of instruction time. An analysis of covariance which made [...]
for the differences in times [...] the same result [...] an experiment where the times were
made [...]. The reader should be able to give the reason why this is so.

² Bartlett (5) does not appear to differentiate between cases (i) and (iv).

³ For an example of the use of the A-test, see this Journal (11).


An educational investigator sometimes supposes that, if he uses initial scores (or intelligence-test
scores) as the independent variate, he can validly apply covariance to non-random
groups. But, although the analysis makes an allowance for differences in initial scores or
in intelligence-test scores (or in both), there is no guarantee that there may not be other
differences between the groups operating to produce final differences which will be unattributable
to the treatments. Unless other evidence is available, the danger of drawing a false
conclusion is equally great when there are no differences between groups in the initial or
final tests. Kendall's example ((4), pp. 240-2) illustrates this point. In his example the
groups show no significant differences in the initial test but significant differences in the
final test after being subjected to the same treatment.²


It follows that this third type of application, like case (ii), cannot be regarded as a method
of increasing precision. I would therefore disagree with Johnson ((9), pp. 310-24) when he
describes his use of covariance as 'increasing the precision of the experimental comparisons.'


Obviously, since the same statistical hypothesis is involved as in subsection (b), case (ii),
the interpretations which were suggested there can also be made for the present case. There
is no need to repeat them. This does not exhaust the possibilities of this particular type.
Kendall's example shows the value of covariance in prediction problems. In that example
"a number of recruits are given a preliminary test to ascertain their suitability for a certain
course of training; at the end of the training they undergo a proficiency test." One of the
issues which the analysis is intended to decide is "whether the same rejection standards
in the preliminary tests can be applied to all recruits, irrespective of their town of origin."
The reader will note that, whatever specific purpose covariance is made to serve, the general
purpose is always that of looking for homogeneity or heterogeneity, as in any analysis
of variance. Jackson ((10), p. 74) expresses the same idea when he says that covariance decides
whether the groups can be combined for the tests involved.³


III. COMPARISON OF THE PRECISION OF
COVARIANCE WITH THAT OF OTHER
PROCEDURES

In the preceding section I have endeavoured to indicate in what cases an analysis of
covariance may be validly said to yield increased precision, and how the precision of covariance
compares with that of a simple analysis of variance. I now propose to compare it with
two other statistical procedures.


(a) Matching Techniques. Consider again a simple experiment on methods. It is
well known that precision can be increased by matching the pupils, prior to the experiment,
in one or more tests: in pairs if only two methods are to be compared, in threes if (as is
somewhat rare) three methods are involved, and so on. The [...] in each matched [...]
are then assigned at random to one or other of the methods. For the final scores the analysis
of variance will be tabulated as follows:

Sums of squares                 d.f.
Between methods                 p − 1
Between matched groups          n − 1
Error                           (n − 1)(p − 1)
Total                           np − 1

[...] matching is far from perfect, particularly if more than two methods [...]
of pupils is limited. Hence, on this count alone, the matching method [...].

Similar remarks apply to experiments on factorial design where other classifications besides that of
treatments enter into the analysis. Thus, for an experiment on methods, replicated in several schools,
[...] might perhaps be improved by matching the methods groups within schools; [...]
object can be achieved more easily and effectively by using random groups or even intact
classes, and then applying an analysis of covariance.

¹ The first sentence of the second paragraph of McNemar's chapter on covariance (3) suggests that
[...] whether he had this in mind.

² Kendall states (p. 243) that, by using the analysis [...] alone on the groups. [...]
between groups will be due partly to the effects [...] and partly to any difference [...]
between the preliminary and final [...].

³ A feature of Jackson's analysis ((10), p. 80) is that, in testing for differences [...]
from the residual sum of squares between classes [...] from the sum of squares due to [...].
(b) Analysis of Variance of Gains (or Differences between Initial and Final Scores). In his
Design of Experiments ((7), pp. 166-9) Fisher discusses certain alternatives to the analysis
of covariance that have been used by investigators, namely:
(i) an analysis of the variance of absolute gains of final over initial measurements;
(ii) an analysis of the variance of such gains expressed as a fraction or percentage of the
initial measurements.

While recognizing that, in the usual agricultural experiment, such methods may not give
results greatly different from an analysis of covariance, he points out that they entail arbitrary
corrections for initial differences, and consequently never achieve the precision of a covariance
analysis. Further, in the absence of other evidence, the experimenter has no assurance that
the corrections are appropriate to his data, nor can he tell how much precision he has lost
by employing them.

Fisher does not completely condemn these methods. And accordingly I propose to
consider the first of them and indicate (for experiments in the educational and [...]) [...]
conditions must be satisfied to obtain adequate results, i.e., results which do not fall short of the
precision secured by an analysis of covariance.

The usual conditions of normality (or approximate normality) [...] variance will again hold;
but in this case the [...] the variance [...] differences $Y - X$. The hypothesis [...]
the same as for [...] variance (of final scores) or [...]: [...]
$\mu_y - \mu_x$ (in the usual notation) is the same for all groups [...] hypothesis that $\mu_y$ is the
same for all groups (since $\mu_x$ is the same for all groups as a result of the experimental design).

Let $\sigma_x^2$, $\sigma_y^2$, and $\rho$ denote the population values of the variances within groups of the
two variates and of the product-moment correlation within groups respectively. Then
in the case of the covariance analysis [...], on the average, has the [...] $\sigma_y^2(1 - \rho^2)$.
In the case of the analysis of variance of differences we must [...] the variance within
groups, which will have, on the average, the population value $\sigma_y^2 - 2\rho\sigma_x\sigma_y + \sigma_x^2$.
For the precision [...] reciprocal of the [...].


The loss in precision is therefore on the average

$$\frac{1}{\sigma_y^2(1 - \rho^2)} - \frac{1}{\sigma_y^2 - 2\rho\sigma_x\sigma_y + \sigma_x^2} = \frac{(\sigma_x - \rho\sigma_y)^2}{\sigma_y^2(1 - \rho^2)(\sigma_y^2 - 2\rho\sigma_x\sigma_y + \sigma_x^2)},$$

or, expressed as a fraction of the maximum precision (given by the analysis of covariance),

$$\frac{(\sigma_x - \rho\sigma_y)^2}{\sigma_y^2 - 2\rho\sigma_x\sigma_y + \sigma_x^2} = \frac{(\sigma_x/\sigma_y - \rho)^2}{(\sigma_x/\sigma_y)^2 - 2\rho(\sigma_x/\sigma_y) + 1},$$

a quantity which is a function of $\rho$ and the ratio $\sigma_x/\sigma_y$.

As will be seen, there is little loss of precision when $\sigma_x/\sigma_y$ is approximately equal to $\rho$.
Where $\sigma_x/\sigma_y$ is not of the same order as $\rho$, and where a straightforward analysis of variance
of differences would involve too great a loss in precision, the analysis of covariance can still
be avoided by multiplying the values of the X and Y variates by factors, a and b say (so that
$\mathrm{var}[aX] \approx \rho^2\,\mathrm{var}[bY]$), and then carrying out an analysis of variance of the differences
bY − aX. This is sometimes a practical procedure.


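[Fig. 1. Precision of the analysis of variance (A of V) and of the analysis of variance of gains
(A of V of Gains), relative to that of the analysis of covariance (A of C), for varying values of ρ.]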


In educational experimentation a common situation is to have $\sigma_x$ equal (or approximately
equal) to $\sigma_y$. This arises (a) when both the initial and final variate values are the scores
on a standardized test, and (b) when scores on non-standardized tests have to be normalized
before analysis in order that any pronounced skewness in score distribution may be removed
and the condition of normality satisfied. Writing $\sigma_x = \sigma_y$ in the above expression, the loss
in precision becomes $(1 - \rho)/2$. It will be seen that it is less than 10 per cent. if $\rho$ is greater
than .8 (which is often the case if the same test is used both at the beginning and end of the
experiment) and only 15 per cent. for $\rho$ equal to .7. Taking the precision of the analysis of
covariance as unity, Fig. 1 shows how the precision of the analysis of variance and of an
analysis of variance of differences varies with the value of $\rho$. The same considerations hold
good in more elaborate experiments where several factors are controlled, the only difference
being that $\sigma_x^2$, $\sigma_y^2$, and $\rho$ now apply to the population values of the variances and correlation
for the interaction term providing the estimate of error variance.
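
The loss of precision discussed above is easy to tabulate. The sketch below is an illustration only, with function names (`loss_fraction`, `relative_precision_gains`, `relative_precision_anova`) that are assumptions introduced here; `ratio` stands for the ratio of the within-group standard deviation of the initial scores to that of the final scores, and `rho` for the within-group correlation. Precision is expressed relative to the analysis of covariance, taken as unity.

```python
def loss_fraction(rho, ratio=1.0):
    """Fraction of the covariance precision lost by analysing differences Y - X.

    rho   : within-group product-moment correlation
    ratio : sigma_x / sigma_y
    """
    return (ratio - rho) ** 2 / (ratio ** 2 - 2 * rho * ratio + 1)


def relative_precision_gains(rho, ratio=1.0):
    """Precision of the analysis of variance of gains, covariance = 1."""
    return 1 - loss_fraction(rho, ratio)


def relative_precision_anova(rho):
    """Precision of the plain analysis of variance of final scores, covariance = 1."""
    return 1 - rho ** 2


for rho in (0.5, 0.7, 0.8, 0.9):
    print(rho,
          round(relative_precision_anova(rho), 3),
          round(relative_precision_gains(rho), 3))
```

With equal standard deviations the loss reduces to (1 − ρ)/2, and it vanishes altogether when `ratio` equals `rho`, which is the case the rescaling by factors a and b is meant to approach.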


The reader may wonder why in such cases one should prefer an analysis of variance of
differences to an analysis of covariance. The answer is simply that it saves considerable
computational labour, often as much as two-thirds, particularly when the experiment
involves several factors. It also avoids the loss of 1 d.f. involved in an analysis of covariance.
A good practical procedure is to try the analysis of variance of differences first and to use
the analysis of covariance in cases of borderline significance.


IV. SUMMARY AND CONCLUSIONS

1. The various uses of the analysis of covariance have been considered with
special reference to its application in psychology and have been discussed under
three main headings:

(a) where covariance is used to increase the precision of an experiment;

(b) where it is a means of analysing the relation between variates which have
been measured after treatment of experimental groups;

(c) where it is applied to data obtained from non-random groups.

2. In each case, special attention has been given to the following points:

(i) the necessary statistical conditions for the valid application of covariance
analysis;

(ii) the exact nature of the statistical hypothesis involved;

(iii) the conclusions that can validly be drawn from the results of testing these
hypotheses;

(iv) misinterpretations of covariance analysis which have been made in text-books
and other publications.

3. An attempt has been made to define the particular conditions under which
it is correct or incorrect to state that the use of covariance 'increases precision.'

4. The analysis of covariance has been compared in some detail with other
methods of analysis: the analysis of variance for [...] and
the analysis of variance of gains (or differences). It has been [...]
maximum precision is normally obtained by means of [...], in certain circumstances
very little precision will be lost by applying [...] of variance to the
gains (or differences), a technique which is computationally [...] simpler.


REFERENCES

1. Lindquist, E. F. (1940). Statistical Analysis in Educational Research. New York: Houghton
Mifflin Co.
2. Snedecor, G. W. (1948). Statistical Methods. Iowa: Iowa State College Press.
3. McNemar, Q. (1949). Psychological Statistics. New York: John Wiley & Sons, Inc.
4. Kendall, M. G. (1945). The Advanced Theory of Statistics, II. London: Griffin and Co., Ltd.
5. Bartlett, M. S. (1936). 'A note on the analysis of covariance.' J. Agr. Sci., XXVI, 488-91.
6. Fisher, R. A. (1946). Statistical Methods for Research Workers. London: Oliver and Boyd.
7. Fisher, R. A. (1949). The Design of Experiments. London: Oliver and Boyd.
8. Edwards, A. L. (1950). Experimental Design in Psychological Research. New York: Rinehart
& Company, Inc.
9. Johnson, P. O. (1949). Statistical Methods in Research. New York: Prentice-Hall, Inc.
10. Jackson, R. W. B. (1940). Application of the Analysis of Variance and Covariance Method to
Educational Problems, Bulletin XI, Dept. of Educ. Research, University of Toronto.
11. Rao, C. R., and Slater, P. (1949). 'Multivariate analysis applied to differences between neurotic
groups.' Brit. J. Psych., Stat. Sect., II, 17-29.
12. Welch, B. L. (1935). 'Some problems in the analysis of regression among k samples of two
variables.' Biometrika, XXVII, 145-


Vol. VI The British Journal of Statistical Psychology May. 
Part I 1953 


THE DISTRIBUTION OF TOTAL RANK 
VALUE FOR ONE PARTICULAR OBJECT 
IN m RANKINGS OF n OBJECTS 


By J. W. WHITFIELD 
Psychological Laboratory, University College, London 


I. Problem. II. Principle of Distribution. III. Moments. IV. Example. 


I. PROBLEM 


It is sometimes necessary to pick out for particular study one object only in a
series, as when an experimental object is presented with control objects for preference
ranking or some other form of judgement. A similar situation arises in experimental
social psychology, when one person in a group is instructed to play a predetermined
role, and each of the other members of the group is asked to rank his fellow members
(including the experimental person) for certain characteristics. In these cases the
experimenter is concerned with the ranks given to the specific object or individual.
For this Kendall's (1948) treatment of the general m ranking problem is inappropriate;
and the following is suggested.


II. PRINCIPLE OF DISTRIBUTION

Consider one ranking of the n objects. The particular object can be assigned
any rank from 1 to n; and in the null case each value will have an equal probability.
Interest lies in whether the object is ranked high or low, according to the experimental
hypothesis under investigation. If a second ranking is considered, the total
of the ranks assigned to the object can take values from 2 to 2n; but the
null case no longer gives equal probability for each total value. The frequencies in
the null case are made up in the following way. With a single ranking only (m = 1)
we have

Rank Values    1    2    3    ...    (n − 1)    n        Total
Frequency      1    1    1    ...       1       1          n

Adding a second ranking to each possible value of the first, we obtain for m = 2

1st Rank                  Total Rank Values                      Total
            2    3    4    ...    (2n − 2)  (2n − 1)   2n
  1         1    1    1    ...                                     n
  2              1    1    ...                                     n
  3                   1    ...                                     n
  ...
  n − 1                    ...       1         1                   n
  n                        ...       1         1        1          n
Total
frequency   1    2    3    ...       3         2        1          n²


This illustrates the general principle that the frequency distribution for m rankings
of n objects consists of the distribution for m − 1 rankings repeated n times in a simple
staggered fashion, a principle which simplifies the computation of the exact values
and facilitates the general expression for the second and fourth moments. In the
appendix (pp. 38-40) tables of the exact probability values are given for m and
n up to 8. For higher values of m or n a normal deviate approximation is possible.
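
The staggered-repetition principle translates directly into a short computation. The following Python sketch is an illustration under that principle, not the author's own procedure; the function names `total_rank_frequencies` and `lower_tail` are assumptions introduced here. Each further ranking takes the current frequency table and adds it to itself n times, shifted by each possible rank 1 to n.

```python
from fractions import Fraction


def total_rank_frequencies(m, n):
    """Null frequencies of the total rank value: {total: count out of n**m}."""
    freq = {rank: 1 for rank in range(1, n + 1)}      # single ranking, m = 1
    for _ in range(m - 1):
        staggered = {}
        for total, count in freq.items():
            for rank in range(1, n + 1):              # repeat n times, shifted by rank
                staggered[total + rank] = staggered.get(total + rank, 0) + count
        freq = staggered
    return freq


def lower_tail(m, n, value):
    """Exact null probability of a total rank value of `value` or less."""
    freq = total_rank_frequencies(m, n)
    favourable = sum(count for total, count in freq.items() if total <= value)
    return Fraction(favourable, n ** m)


print(total_rank_frequencies(2, 3))
print(float(lower_tail(3, 3, 4)))
```

Exact fractions are used so that the tabulated five-figure values can be reproduced by rounding at the end rather than accumulating rounding error.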


III. MOMENTS

It will be seen that the distribution is symmetrical, and therefore the higher odd
order moments are zero. The mean total rank value is $\tfrac{1}{2}m(n + 1)$, and deviations
from this are to be tested. Let $x$ be the deviation.

Variance

$$\sum_m x^2 = n\sum_{m-1} x^2 + 2n^{m-1}\left\{1^2 + 2^2 + \dots + \left(\tfrac{n-1}{2}\right)^2\right\} \quad (n \text{ odd})$$

$$\sum_m x^2 = n\sum_{m-1} x^2 + 2n^{m-1}\left\{\left(\tfrac{1}{2}\right)^2 + \left(\tfrac{3}{2}\right)^2 + \left(\tfrac{5}{2}\right)^2 + \dots + \left(\tfrac{n-1}{2}\right)^2\right\} \quad (n \text{ even})$$

(since $\sum_m x^2$ is the sum of $x^2$ in the m ranking case, when all the $n^m$ equiprobable occurrences
are considered)

$$= n\sum_{m-1} x^2 + \frac{n^m(n^2 - 1)}{12}.$$

Thus

$$\sum_m x^2 = n^{m-1}\sum_1 x^2 + (m - 1)\,\frac{n^m(n^2 - 1)}{12},$$

and hence

$$\sum_m x^2 = \frac{m\,n^m(n^2 - 1)}{12},$$

and the variance therefore

$$\frac{m(n^2 - 1)}{12}.$$

Fourth Moment

$$\sum_m x^4 = n\sum_{m-1} x^4 + 2 \times 6\sum_{m-1} x^2\left\{1^2 + 2^2 + \dots + \left(\tfrac{n-1}{2}\right)^2\right\} + 2n^{m-1}\left\{1^4 + 2^4 + \dots + \left(\tfrac{n-1}{2}\right)^4\right\} \quad (n \text{ odd}),$$

with corresponding values for n even,

$$= n\sum_{m-1} x^4 + 12\sum_{m-1} x^2\,\frac{n(n^2 - 1)}{24} + 2n^{m-1}\left\{\frac{n(n^2 - 1)(n^2 - 4)}{160} + \frac{n(n^2 - 1)}{96}\right\}$$

$$= n\sum_{m-1} x^4 + (m - 1)\,\frac{n^m(n^2 - 1)^2}{24} + \frac{n^m(n^2 - 1)(n^2 - 4)}{80} + \frac{n^m(n^2 - 1)}{48}.$$

Hence

$$\sum_m x^4 = \frac{n^m(n^2 - 1)^2}{24}\{(m - 1) + (m - 2) + \dots + 1\} + \frac{m\,n^m(n^2 - 1)(n^2 - 4)}{80} + \frac{m\,n^m(n^2 - 1)}{48}$$

$$= \frac{n^m(n^2 - 1)^2\,m(m - 1)}{48} + \frac{m\,n^m(n^2 - 1)(n^2 - 4)}{80} + \frac{m\,n^m(n^2 - 1)}{48}.$$

Therefore

$$\mu_4 = \frac{(n^2 - 1)^2\,m(m - 1)}{48} + \frac{m(n^2 - 1)(n^2 - 4)}{80} + \frac{m(n^2 - 1)}{48},$$

$$\beta_2 = \frac{3(m - 1)}{m} + \frac{9(n^2 - 4)}{5m(n^2 - 1)} + \frac{3}{m(n^2 - 1)} = 3 - \frac{6}{5m} - \frac{12}{5m(n^2 - 1)},$$

which tends to the value 3 as m and (to a lesser extent) n increase. Since x is
a discontinuous variable, a correction of ½ is applied; and the normal deviate is
therefore

$$\frac{\left|\,\text{Total Rank Value} - \tfrac{1}{2}m(n + 1)\,\right| - \tfrac{1}{2}}{\sqrt{\dfrac{m(n^2 - 1)}{12}}}.$$

¹ [...] $\dfrac{m(m - 1)}{2!}$ [...] $(x - 2n - 1)$ [...] $(x - 2n - m)!$ [...], so long as none of the factors are
negative. [...] to be calculated [...]. For tail-values, only the [...]; [...] we divide by $n^m$. The [...]
in the appendix, and [...] would make it possible [...] when equiprobability cannot be assumed.
The effect of tied rankings has not been considered. However, if ties are present,
the true variance will be slightly less than that given above for the appropriate m
and n. Thus the normal deviate obtained by this procedure will be slightly smaller
than the true value, and the corresponding probability slightly too great. If a significant
value is obtained when tested against the uncorrected variance, the presence
of ties in the rankings will merely mean that the result has slightly more significance
than the calculated figure would suggest.
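
The closed forms for the variance and for β₂ can be checked against a direct enumeration of the equiprobable outcomes. The sketch below is such a check for small m and n only; the function names `enumerated_moments` and `formula_moments` are assumptions introduced here, and the enumeration grows as n to the power m, so it is not meant for large cases. Ties are not modelled.

```python
from fractions import Fraction
from itertools import product


def enumerated_moments(m, n):
    """Variance and beta_2 of the total rank value by listing all n**m outcomes."""
    mean = Fraction(m * (n + 1), 2)
    totals = [sum(ranks) for ranks in product(range(1, n + 1), repeat=m)]
    mu2 = sum((t - mean) ** 2 for t in totals) / len(totals)
    mu4 = sum((t - mean) ** 4 for t in totals) / len(totals)
    return mu2, mu4 / mu2 ** 2


def formula_moments(m, n):
    """Variance m(n^2 - 1)/12 and beta_2 = 3 - 6/(5m) - 12/(5m(n^2 - 1))."""
    variance = Fraction(m * (n * n - 1), 12)
    beta2 = 3 - Fraction(6, 5 * m) - Fraction(12, 5 * m * (n * n - 1))
    return variance, beta2


for m, n in [(2, 5), (3, 4), (4, 3)]:
    print(m, n, enumerated_moments(m, n) == formula_moments(m, n))
```

Both odd and even n are included in the loop, since the derivation treats the two cases separately before arriving at a common result.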
IV. EXAMPLE

Eight judges rank seven objects. The experimental object has a total rank value
of 43. What is the probability that this or a higher value would be obtained by
chance?

Variance = $\tfrac{8}{12}(7^2 - 1) = 32$;

Normal deviate = $\dfrac{43 - \tfrac{1}{2} \times 8(7 + 1) - \tfrac{1}{2}}{\sqrt{32}} = 1.8556$;

P = .0318 (approx.).

To obtain the exact value from the tables we have to reverse the direction of the
deviation, as the other tail only is tabulated. The corresponding deviation in the [...]
value. In the example this is 64 − 43 = 21. [...] a total rank value of 21 or less would be
[...]. The normal approximation agrees closely with this.

REFERENCE
Kendall, M. G. (1948). Rank Correlation Methods. London: Griffin and Co.


TABLES OF EXACT VALUES OF P FOR TOTAL RANK VALUES

The probability values given below are for deviations in the predicted direction only, and should
be doubled if absolute probability values are required.

m = 2
                                   n
Total Rank
Value         3        4        5        6        7        8
   2       .11111   .06250   .04000   .02778   .02041   .01563
   3       .33333   .18750   .12000   .08333   .06122   .04688
   4                .37500   .24000   .16667   .12245   .09375
   5                         .40000   .27778   .20408   .15625
   6                                  .41667   .30612   .23438
   7                                           .42857   .32813
   8                                                    .43750

m = 3
                                   n
Total Rank
Value         3        4        5        6        7        8
   3       .03704   .01563   .00800   .00463   .00292   .00195
   4       .14815   .06250   .03200   .01852   .01166   .00781
   5       .37037   .15625   .08000   .04630   .02915   .01953
   6                .31250   .16000   .09259   .05831   .03906
   7                .50000   .28000   .16204   .10204   .06836
   8                         .42400   .25926   .16327   .10938
   9                                  .37500   .24490   .16406
  10                                  .50000   .34111   .23438
  11                                           .44606   .31641
  12                                                    .40625

m = 4
                                   n
Total Rank
Value         3        4        5        6        7        8
   4       .01235   .00391   .00160   .00077   .00042   .00024
   5       .06173   .01953   .00800   .00386   .00208   .00122
   6       .18518   .05859   .02400   .01157   .00625   .00366
   7       .38272   .13672   .05600   .02701   .01458   .00854
   8                .25781   .11200   .05401   .02915   .01709
   9                .41406   .19520   .09722   .05248   .03076
  10                         .30400   .15895   .08746   .05127
  11                         .43200   .23920   .13578   .08057
  12                                  .33565   .19783   .11987
  13                                  .44367   .27280   .16968
  14                                           .35860   .22974
  15                                           .45190   .29907
  16                                                    .37598
  17                                                    .45801


m = 5
                n
Total Rank
Value         3       [...]
   5       .00412
   6       .02469
   7       .08642
   8       .20988
   9       .39506

m = 6
                             n
Total Rank
Value         3        4        5        6       [...]
   6       .00137   .00024   .00006   .00002
   7       .00960   .00171   .00045   .00015
   8       .03841   .00684   .00179   .00060
   9       .10700   .02050   .00538   .00180
  10       .23045   .04980   .01344   .00450
  11       .40329   .10254   .02918   .00990
  12                .18433   .05649   .01968
  13                .29590   .09907   .03588
  14                .42920   .15994   .06076
  15                         .23968   .09647
  16                         .33606   .14463
  17                         .44397   .20585
  18                                  .27939
  19                                  .36310
  20                                  .45358


Total Rank                        n
 Value      3        4        5        6        7        8

m = 7
     7   .00046   .00006   .00001   .0⁵4     .0⁵1     .0⁶5
     8   .00366   .00049   .00010   .00003   .00001   .0⁵4
     9   .01646   .00220   .00046   .00013   .00004   .00002
    10   .05167   .00732   .00154   .00043   .00015   .00006
    11   .12986   .01971   .00422   .00118   .00040   .00016
    12   .24691   .04492   .01005   .00283   .00096   .00038
    13   .41015   .08936   .02125   .00610   .00208   .00082
    14            .15820   .04070   .01206   .00416   .00164
    15            .25305   .07162   .02209   .00775   .00307
    16            .37012   .11686   .03787   .01359   .00543
    17            .50000   .17824   .06122   .02260   .00915
    18                     .25574   .09388   .03584   .01477
    19                     .34714   .13716   .05445   .02293
    20                     .44794   .19170   .07954   .03432
    21                              .25717   .11205   .04972
    22                              .33216   .15259   .06987
    23                              .41421   .20137   .09543
    24                              .50000   .25802   .12693
    25                                       .32161   .16466
    26                                       .39065   .20864
    27                                       .46315   .25856
    28                                                .31377
    29                                                .37329
    30                                                .43586
    31                                                .50000

m = 8
     8   .00015   .00002   .0⁵3     .0⁶6     .0⁶2     .0⁷6
     9   .00137   .00014   .00002   .0⁵5     .0⁵2     .0⁶5
    10   .00686   .00069   .00012   .00003   .0⁵8     .0⁵3
    11   .02393   .00252   .00042   .00010   .00003   .00001
    12   .06447   .00743   .00127   .00029   .00009   .00003
    13   .14129   .01854   .00327   .00077   .00022   .00008
    14   .26078   .04033   .00750   .00178   .00052   .00018
    15   .41564   .07805   .01555   .00379   .00111   .00038
    16            .13638   .02959   .00745   .00222   .00077
    17            .21768   .05210   .01369   .00415   .00144
    18            .32034   .08573   .02369   .00736   .00259
    19            .43826   .13263   .03887   .01242   .00443
    20                     .19392   .06071   .02007   .00727
    21                     .26918   .09065   .03113   .01152
    22                     .35622   .12983   .04654   .01763
    23                     .45115   .17888   .06724   .02616
    24                              .23771   .09406   .03770
    25                              .30540   .12769   .05289
    26                              .38017   .16852   .07233
    27                              .45953   .21655   .09656
    28                                       .27135   .12602
    29                                       .33203   .16095
    30                                       .39728   .20139
    31                                       .46543   .24714
    32                                                .29772
    33                                                .35237
    34                                                .41012
    35                                                .46982


Vol. VI          The British Journal of Statistical Psychology          May, 1953


NOTES AND CORRESPONDENCE 


ON MAKING STATISTICAL ASSUMPTIONS 
By A. S. C. EHRENBERG
Institute of Psychiatry, Maudsley Hospital 


In psychological circles there are several schools of thought which regard [...], particularly
if at all complex, with distrust. The [...] logical as opposed to empirical lines. Zangwill (1),
for example, bases his criticisms on Chambers' pronouncement that "it is at least very doubtful
whether the concept of measurable quantity may be applied at all to psychological qualities" (2);
and we are repeatedly told that the question whether or not psychological qualities are measurable
is a matter of logical or formal [...]. At the same time the commonest line of attack on numerical
methods in psychology is the complaint that their use involves too many 'assumptions.' Clearly,
therefore, the nature of these so-called 'assumptions,' and the possibility of error either in
overlooking or in relying on them, requires some examination, even without explicitly entering into
the wider issues involved.

[...] three kinds of error can occur in setting out the assumptions on which an argument is based.
First, some of the assumptions implicit in the argument may be omitted or insufficiently explained.
Secondly, the nature of the relevant assumptions and of their logical status may be misinterpreted.
Thirdly, so-called assumptions may be introduced which cannot strictly be regarded as assumptions
at all.

To illustrate the need for a closer scrutiny, I shall refer more particularly to the recent volume
by Professor Harold Gulliksen on Theory of Mental Tests (3), and the comments passed on it in a
review published in the British Journal of Psychology (General Section, XLII, pp. 80-1) over the
initials 'E. G. C.' Here the reviewer marshalled a set of 'assumptions' requisite for the application
of statistical procedures in a manner which is admirably concise and so lends itself readily to a
brief examination. In the review printed in this Journal (V, pp. 134-5) it was stated that in
Gulliksen's book "the assumptions and definitions are carefully formulated": and this opinion also,
as will be seen from what is said below, seems much too generous.
Failure to give all the assumptions inherent in an argument is a well-known fault; and failure to
stress and explain the most important assumptions is still commoner. This type of error can be most
simply illustrated by a familiar problem, namely, the use of normal distributions. Thus [...] book
we find that the requirement of normality for Wilks' [...] fact that Gulliksen emphasizes that almost
all his formulae are true, irrespective of any assumptions about the distributions involved. But
the [...] in applied mathematics is not so much whether the formulae are 'true' (that the
psychologist may leave to the mathematician), but whe[...] distributional problem appears
particularly unfo[...] the more advanced investigator [...] distribution of his dat[...] systematic
experimental designs (e.g., in working with an experimental and a 'control' group).

At times there seems to be a peculiar difficulty in detecting the more isolated omissions. When, for
instance, in the two reviews mentioned above, it is stated that the assumptions in Gulliksen's book
are "scrupulously pointed out [...] are made" or that they have been "carefully formulated," the
reviewers [...] the fact that certain assumptions have been stressed into supposing that [...] have
been mentioned.

[...] when assumptions are explicitly stated, there is often some confusion in the way their nature
or their logical status is treated. Not infrequently it seems to [...] accepted by [...] of faith.
In the review already quoted, for example, certain [...], and the writer then asks: 'How many
psychologists woul[...]

The editor of the General Section, to whom my comments were originally submitted, suggested that it
might be better to state the arguments more fully in the Statistical [...].



[...] examples can be found in the discussions of normality. Thus, in regard to t[...] scores the
reviewer tells us that 'without this or some similar [...] progress cannot be made.' But this
normality is not an 'assumption' tha[...] by being 'willing' to do so. What is needed is an
empirical [...] distribution function, possibly normal, possibly [...] into a matter of descriptive
statistics, [...] assumption. If the data cannot be adequately described by normal distributions,
then theo[...] formulae which assume normality should not be used.


[...] where a mathematical model [...]. As an instance, may I refer once again to the [...]
assumption that a person's raw [...]. As it stands, it would be impossible to show that this
statement is either true or untrue. We might perhaps amend it to read: 'Let us [...] involving two
variables, which may be called "true" and "error" scores, and may be defined as follows . . .' But
the specification of a model is in itself not a matter of assumption or proof. For much the same
reasons it is misleading to say that the 'true' and 'error' [...]. Such a statement would in effect
be part of the definition of [...] proposed. [...] of practical consequence. Instead of asking
whether the [...] (in some way not explicitly defined), and assuming its 'truth' [...], it becomes
the investigator's business first to ascertain [...] consistent, and then whether it furnishes an
appropriate scheme for describing the data.

Finally, three general points may be briefly considered. First, how is the concept 'assumptions'
to be defined? Secondly, what is the effect of the foregoing arguments on the reviewer's evaluation
of Professor Gulliksen's book? And lastly, how far is it legitimate to generalize the arguments
here [...]


from various quarters. Yet some suci 


assumption ’ is to invite criticism 
tions. We may begin by excluding bi 


» at ‘least for 


though inter- 
we may exclude 
nly provisionally 
I suggest, therefore, 
proposition which may or may not be true in fact, but which 
good if a given theoretical result is to be validly deduced. “An obvious instance arises in 


r 0 ating a parameter or testing significance. 
umptions of this type assert the n 


1 normality, linearity, ог homo- 
ons to which the formule are to be applied, In empirical science all 
such assumptions must in pi i i i 


[...] book, [...] pronounced by the reviewer in the General Section, [...] declares that
'Psychological researc[...] turn aside with feelings akin to horror.' Since it is stated that "the
algebra is worked out clearly enough," the reviewer's reasons must presumably arise from his doubts
over the assumptions made. Now, although on this latter count his argument¹ seems unacceptable,
nevertheless his rejection of the book might still be endorsed for quite the opposite reason, namely,
that the algebra and the statistical concepts appear, on closer scrutiny, to be decidedly confused.

¹ He appears to be in error when he describes Gulliksen as assuming that the standard deviation
of the 'error' scores is normally distributed (cf. (3), p. 17).

Lastly, if it be agreed that the contentions here advanced are sound so far as they go, it is 
tempting to suggest that the main points may reasonably be generalized. Certainly, it seems likely 
enough that the widespread distrust of statistical methods in psychology is really due, not to the 
profounder logical or formal difficulties that are commonly said to beset all attempts at numerical
measurement, particularly as these difficulties are seldom explicitly analysed ; rather it would 
appear that it chiefly springs, however indirectly, from difficulties of a purely technical or empirical 
nature. Many psychometric techniques, for instance, prove to be extremely restricted in their 
applications, and in practice these restrictions are constantly forgotten. Others turn out to be 
inherently faulty as judged from a technical standpoint. 


REFERENCES 
1. Zangwill, O. L. (1950). An Introduction to Modern Psychology. London: Methuen & Co.
2. Chambers, E. G. (1943). 'Statistics in Psychology and the Limitations of the Test Method,'
Brit. J. Psychol. Gen. Sect., XXXIII, 191.
3. Gulliksen, H. (1950). Theory of Mental Tests. New York: Wiley. 


ON MAKING STATISTICAL ASSUMPTIONS: A REPLY


Mr. Ehrenberg's [...] making statistical assumptions, which might perhaps more aptly be described
as a [...], calls for a few comments. This is perhaps scarcely the place to enter on a controversy
about assumptions. But, as the pilloried reviewer, I may say that considerations of space allowed me
only 200 words in the General Section of the British Journal of Psychology in which to describe
Professor Gulliksen's book and express an opinion on its value [...]. Mr. Ehrenberg uses more than
six times as much space in the [...], and apparently comes to the same conclusion: that the book is
not [...]

е « either 
Sens 5 i , "s claim that опе must always be able eith 
to E Something taken for granted» Mr. Ehre iet ice in the definitions of the Oxford е 
of Ctionary ; and it seems that he has forgotten that sanicl maoa ea d far Ертесі 
by Probability (* the calculus of uncertain inference ') and tha E. Gi C. 




THE SCIENTIFIC STUDY OF PERSONALITY 
A Note on the Review 
By H. J. EYSENCK 
Institute of Psychiatry, Maudsley Hospital 


Reviewers have a great deal of latitude in the way in which they express their opinions ; and 
no author should be too sensitive when he finds his most dearly cherished thoughts given short shrift. 
But criticism must be of what the author has actually written; it must avoid misquotations, factual
errors, and other objectively fallacious arguments. The review of my book The Scientific Study of 
Personality (see this Journal, V, pp. 208f.) deviates so far from what is permissible that a note of 
protest appears necessary. I shall take up the reviewer's errors in numbered series, as they form 
an almost perfect example of the art of distortion.

1. Change of sense through direct misquotation. On p. 211 he prints the following sentence:
'"Burt's system of classification," he (i.e., Eysenck) says, "is some thirty years out of date, and
discredited in contemporary psychology for many valid reasons."' What I actually say (p. 41) is:
“ Burt's work is based on the rating of the strength of McDougallian ‘instincts’ in children by 
teachers. The choice of a system of classification some thirty years out of date . . ." No one 
reading the passage in its context can doubt that it is McDougall's system which I consider out of 
date; the quotation is subtly changed in such a way as to give a completely erroneous impression. 

2. Change of meaning through indirect misquotation. In the same paragraph, the reviewer,
referring to Burt's 1915 Report, says that "Burt tells us, the correlations were based on 'items of
behaviour systematically recorded.'" This is incorrect. Burt does not tell us at all what his
correlations were based on. He merely discusses different possible ways in which ratings could be obtained,
and which of these methods he considers superior. 

3. Change of meaning through incomplete quotation. The reviewer quotes my sentence "Burt . . . has
contrasted the factor of 'emotionality' with the factor of 'neuroticism'" (p. 115). He goes on to
say that this is incorrect, and that Burt in fact considers the two to be "virtually identical."
However, I continue my paragraph by quoting Burt's very words, bringing out clearly in what way he
contrasts the two factors. Had the reviewer not broken off the quotation where he did the reader
could have judged for himself the accuracy of my interpretation.

4. Disparagement through imputation. He discusses my claim that objective tests are superior to
other types of tests. Because I do not quote the recent Symposium in the Brit. J. Educ. Psychol.,
he concludes that I am "apparently unaware of the fairly comprehensive survey of the reliability
and validity of different methods there brought together." He does not appear to consider the
possibility that other causes than ignorance may be responsible; and that other psychologists might
not agree that the survey was "fairly comprehensive." It was not quoted because it was in my view
so one-sided in its presentation, and so restricted in its coverage, as to be practically valueless
for my purpose. To construe this [...] into ignorance on my part of its very existence is unworthy
of a scientific review.

5. Disp[...] The reviewer criticizes my use of objective tests by saying that the usual objection
to such tests is that their reliability is low, and that their validity [...]. He complains that my
tables seldom include reliability coefficients, and that those I give are by no means high. The
tests specifically mentioned [...] to have reliabilities [...], for instance, is above .9, [...]
persistence has an estimated minimum retest reliability [...]; they would almost certainly be
higher than [...] the main point is that I [...] on [...] rather than on reliability, which is [...]
for the purpose of reaching higher validity. [...] valid, and must therefore be reliable.

6. [...] through falsification of argument. On p. 210, [...] my critic prefers [...] This is a
complete mis[...]

7. [...] 209 he says that in order to secure a unique and invariant position I carry out "a rough
graphical rotation." In fact the rotation is precise and algebraic. I do not quite see how one could
carry out a graphical rotation under the special circumstances even if one wanted to do this. The
"rough graphical rotation," therefore, is entirely an invention of the reviewer's: it finds no
counterpart in my book.

8. Misrepresentation through ignorance. In dealing with criterion analysis, he says (having
quoted my aim in rotating as that of reaching a position for the factor-axis where the correlation
between factor and criterion is a maximum): "one would have supposed that in doing this the
investigator would eventually approach a position which was identical with the criterion-axis, and
that the maximized correlation . . . would be simply the ordinary multiple correlation between the
criterion and the tests." This quotation shows that the reviewer has failed to understand the theory
of criterion analysis. There can be no multiple correlation between the criterion and the tests,
because the criterion is not included in the factor analysis, and cannot be so included, by virtue
of the fact that the analysis is carried out separately on the normal and the abnormal groups, not
on the combined groups. This is the most important point in criterion analysis. If the reviewer has
failed to appreciate the consequences that follow, it may be legitimate to query whether his
understanding is adequate to the task he has undertaken.

9. Misrepresentation through carelessness. He follows the criticism quoted above by saying:
"It is true that he (Eysenck) goes on to say that one may also increase the correlation by improving
the criterion. . . . How this is to be done we are never told." The reader is told at length on
pages 76 and 77. W. L. C. would be justified in arguing that he did not agree with the arguments, or
did not understand them. He cannot plead that they are not to be [...]

10. Misdirection through non sequitur. The reviewer quotes very briefly my account of the
hypothetico-deductive method, as [...] the postulation of an hypothesis, the deduction of certain
consequences, and the verification of these consequences along experimental lines. He then declares
it to be invalid by quoting Judge Jeffreys' saying: "If Mistress Gaunt had been guilty, you may
rest assured she would have been burned. She was burned. Ergo, she was guilty." In some obscure
way, he seems to feel that this story is relevant, and disproves my own argument. It seems to me a
complete non sequitur. I leave it to the reader to decide whether he regards this as a [...]elling
and relevant argument.

[...] In connexion with my discussion of scientific methodology he says: "To prove his (Eysenck's)
hypotheses, it is not sufficient to report experiments that are consistent with his corollaries: [...]


ces in actual fact are not to be found, 


E ut it is contrary to fact to say t lestis Very 
12. Use of alternative plea. The alternative plea is ver[...] (that he did not kill the victim, but
that, if he did, he did [...]). [...] on the use of this device, which is exemplified in its p[...]
he assures the reader that my treatment of abnormal mental states as q[...] normal people is opposed
to Burt's assumptions [...] systems and the concepts involved are similar [...]. While either one
or the other of these arguments [...] be true, they cannot both be true. There [...] other instances
of this peculiar type of argument in [...] review; it would hardly be worth while to exhume them.

13. Misrepresentation through wrong emphasis. Throughout his review, the writer attempts to relate
my results to those of Burt, castigates me for not agreeing with Burt, attempts to establish [...]
priority with [...] such concepts as emotionality, [...], and generally writes as if no other
psychologist [...] had ever [...] on the subject. [...] is erroneous in two [...]. In the first place,
my book is [...] an historical treatise: [...] previous writers are mentioned, but there is no
systematic attempt to cover the ground. [...] originally contemplat[...] was too extensive, and [...]
as a separate publication under the title The Structure [...] appropriate to this later volume: they
are irrelevant to the [...] place, the criticisms made are poorly taken. If there are similarities
[...] be the last to deny, they presumably are [...] largely to the fact that the early workers [...]
laid down [...] on the experimental side [...] on the theoretical side [...] to factor analysis. If,
therefore, I should have entered into the historical side, the correct emphasis would not have been
on Burt, but on these other, [...]


ther that of improving the underlying methodology 
and making more precise the measurement of the 4 


are a baker's dozen of serious misri 
of misunderstandings, and of complete inventions i 
The list could have been extended, but it is perh: 


knowledge of the field, his ability to understand the complex arguments involved, and his impartial 
fairmindedness in assessing the validity of proofs and demonstrations. 


THE SCIENTIFIC STUDY OF PERSONALITY: A REPLY
Methodology. ysenck's thirteen criticisms I shall assume that readers will 
more interested in the appraisal of hi 


with this that my reply will chiefly deal. 
sentation through suppression,’ 


8 а fallacious syllogism attributed to Judge Jeffreys, 
obscure way," says Dr. Eysenck, “ i ze i p ch 
argument,” 


Story," he will see that it was i 
“disprove” it, I went 


Eysenck's) 


y such a mode of inference was defective, 
5 In deductive logic the only way 
to prove a hypothetical premiss i: 
of elaborating the poi 


bing, loc. cit., p. 105). Instead 

point at greater length, T thought it sufficient to refer to several logicians and 

Statisticians who hi sed it, and whose objections Dr. Eysenck had ignored in his 
book, and still ignores in his reply, 


ow the " method cj 1 nce d from it have been verified ko 
in several “ important respects.” Of science ” improves on this inconclusive procedu: 


is,’ "invention, с "Pete quotation,’ * misrepresentation 
Sion Hei nb po LS etc., etc., all in a single paisna I will not 
tortion "5 but it Certainly presents а very mislead тоон рага оро fen 
of inductive Methods as appli i E 


ed to the physi picture of oer Stebbing’s own analysis 


The Disproof of Alternative Hypotheses. The obvious way to escape any such formal fallacy 
would be to attempt a direct disproof of all the conceivable alternatives to whatever hypothesis the 
writer wishes to maintain. In empirical science, however, the most we can seek will be, not deductive 
proofs leading to conclusions that are certain, but arguments based on probability, leading at best 
to an inference that the hypothesis favoured by the investigator is more probable than the remaining 
alternatives. I thought it unnecessary to amplify these points in any detail because they had already 
been adequately dealt with in the references I quoted. 
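
The formal fallacy in question, that of arguing from a verified consequence back to the truth of the hypothesis, can be checked by exhaustive enumeration of truth-values. The sketch below is purely illustrative; the function and variable names are hypothetical and form no part of either writer's argument.

```python
from itertools import product


def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q


def is_valid(premises, conclusion):
    """True if the conclusion holds in every case in which all premises hold."""
    return all(
        conclusion(h, c)
        for h, c in product([False, True], repeat=2)
        if all(premise(h, c) for premise in premises)
    )


# "If H then C; C is observed; therefore H" (affirming the consequent).
affirming_consequent = is_valid(
    [lambda h, c: implies(h, c), lambda h, c: c],
    lambda h, c: h,
)

# "If H then C; C is not observed; therefore not H" (modus tollens).
modus_tollens = is_valid(
    [lambda h, c: implies(h, c), lambda h, c: not c],
    lambda h, c: not h,
)

print("affirming the consequent valid:", affirming_consequent)
print("modus tollens valid:", modus_tollens)
```

The first form is found invalid (the case in which H is false and C is true satisfies both premises), while the second is valid. In deductive terms, then, observed consequences can refute a hypothesis but cannot establish one, which is why the disproof of alternatives carries the weight.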

But although in section 10 of his reply Dr. Eysenck seems to reject my general contention, in
section 11 he apparently admits it. He now claims that, "for the main theory tested," he did in
point of fact "disprove the alternative hypotheses." But he interprets this requirement in a way
very different from most other methodologists. Hence I still think my two chief criticisms were
justified. First, the need for this all-important step is nowhere mentioned in his account of the
'hypothetico-deductive' methodology. Secondly, in what are described as the central enquiries of the
book, there is no direct attempt to disprove the alternatives. We are, it is true, told that, where
Kretschmer holds a different view, his methodology is 'faulty' and his attempts at proof
'disingenuous.'* For the rest the disproof of the chief alternative is treated as an incidental
consequence of the proof of the main hypothesis, whereas the correct mode of reasoning would invert
the order. However, in view of the way in which he seems to have missed my principal point, let us
examine a little more closely his study of psychosis, since this is the part of his book which he
complains that I ignored, and the part which has chiefly aroused criticism from medical writers.
Here he claims, by means of the hypothetico-deductive procedure, to have established two
far-reaching conclusions.

Hypothesis I. The first is the hypothesis that "mental states diagnosed as psychotic by
psychiatrists" are merely instances of extreme variation in a single quantitative continuum, the
'dimension of psychoticism,' which ranges from perfect normality to definite psychosis. The
alternative would be the view that certain psychoses at least differ qualitatively from each other
and from the normal state, much, for instance, as Korsakow's psychosis is ordinarily supposed to
differ from general paralysis, and both from normality. In accordance with the steps described
in his opening chapter, Dr. Eysenck first deduces what he takes to be the consequences of the
'quantitative hypothesis' (that of continuity), and then attempts to verify these. But, so far as I
can discover, he deduces no consequences from the alternative hypothesis, nor does he undertake
any independent investigation to disprove them.

To begin with, he [...] all his [...] of 'psychosis' into a single group. No attempt is made to
show that his '100 psychotic subjects' were typical psychotic patients; indeed, from their [...]
performances at the tests most psychiatrists would be tempted to suppose that they were highly
selected cases. He then argues that, if the psychotic group is continuous with the normal, tests
which differentiate significantly between the two should yield positive inter-correlations for each
group separately (his italics). He adds that these inter-correlations should be "proportional to
the power of each test to differentiate between the normal and psychotic groups," the
'differentiating power' being measured by a biserial correlation.* He offers no formal demonstration
that this corollary follows logically from [...] hypothesis: he merely "points it out." [...] Tables
XL and XLI; and he thereupon [...] that "we find our deduction verified, and it would appear that
psychotic
that Miss Stebbing’s section on 


ETE. " » y: 
dt is in regard to the relation between induction and prohabiity S t Miss Duy usefully compare 


quss ienti 1 5 somewhat out of date. 
Poth her Vi Sa De DM ot thos put forward in Russell, B., Human Knowledge, 1948, 

In his Nb. ientific Inference.’ ШЕТ 
>In his Note Ea iere ел Ye. this point by declaring: | e is S 9 e. id 
that there is‘ nothing but abuse.” But I said по such thing. . What I did say м AM E 
alternative theories, fe seemed a little too apt to “ abuse the plaintiff азоо рта 
faché to mean that, instead of adducing facts to disprove the views herdin i tent be rolled sath (52 

Teely on an attempt to disparage the scientific competence of those w! 

"ia atem, k's statistical studies, the practical 
: atest i 0 sis ше саа he patients tested. We are 
ere “ divided into two equal groups 


Simply told that his “ psychotic subjects " (as he calls them) were | ails are vouchsafed to show 


ether the individuals so labelled were typie lof have thought that 


der these eat L ght th 
somewhat vague diagnoses. nee 2 with 
Psychologist was convinced of the need for combining case-studies wer? of his tests presupposes that 


We г 2 орг с ions and standard 
me ie two CODSHEDSDAR EDU (о RETRO ffect of selection on the uc that the hypo- 


i . Eysenck gives, 
t Vlations, So far as can be judged from the data Dr. Ey: 


ormal distribu tinuum i from be d. 
i i i is far fri ing fulfilled. 
й 1 distribution along the single conti 


Notes and Correspondence 


[...] continuum with normal." A few paragraphs later he adds: "we have shown [...] are not
'qualitatively different from normal people,'" i.e., that the [...] hypothesis is disproved. But the
only evidence for this further inference is a parenthetical assertion (unproved) that the
consequences of his own hypothesis would not follow "on any tenable counter-hypothesis." Apparently
he assumes that with any 'counter-hypothesis,' e.g., the hypothesis that differences were
qualitative, the average inter-correlation would be zero. Yet this is never proved; and is it really
so? Suppose we were investigating the analogous alternatives in a study of sex-differences: would
Dr. Eysenck expect to find that if sex differences were qualitative (due, say, to an additional
X-chromosome), then the inter-correlations between measurements for height, weight, shoulder
breadth, etc., would yield an average correlation of zero in males and females taken separately?
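
The point can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reanalysis of any data in the book: the group difference is modelled as an all-or-none shift applied equally to every member of the second group (a crude stand-in for a qualitative difference), while within each group two measurements share a common source of individual variation. The loadings (0.7), the shift (2.0), the sample size and the seed are arbitrary illustrative choices.

```python
import random


def pearson(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_group(size, shift, rng):
    """Two measurements sharing a within-group factor; `shift` is a fixed group effect."""
    first, second = [], []
    for _ in range(size):
        common = rng.gauss(0, 1)  # individual variation shared by both measurements
        first.append(shift + 0.7 * common + rng.gauss(0, 0.7))
        second.append(shift + 0.7 * common + rng.gauss(0, 0.7))
    return first, second


rng = random.Random(1953)
group1_a, group1_b = simulate_group(1000, 0.0, rng)
group2_a, group2_b = simulate_group(1000, 2.0, rng)

r_group1 = pearson(group1_a, group1_b)
r_group2 = pearson(group2_a, group2_b)
mean_gap = sum(group2_a) / len(group2_a) - sum(group1_a) / len(group1_a)

print(f"within-group r: {r_group1:.2f}, {r_group2:.2f}; mean difference: {mean_gap:.2f}")
```

With these loadings the expected within-group correlation is 0.49/0.98 = 0.5 in both groups, although the difference between the groups is entirely of the all-or-none kind. Positive inter-correlations within each group taken separately are therefore quite compatible with a qualitative difference between the groups.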

Or consider the parallel question that arises in the field of cognitive psychology: do patients
in mental hospitals who have been diagnosed as imbeciles or idiots differ from normal persons solely
because of some "quantitative variation in a single continuum"? Take any two tests that will
'differentiate significantly' between 100 normals and 100 defectives, say, any pair from Abelson's
battery; and calculate their correlations within each group taken separately. The coefficients will
undoubtedly be, not zero, but positive and significant. On taking all the tests in the battery, we
should find that the average of the correlations for each (and therefore its factor-saturation)
would vary [...] which, on more general grounds, he took to be almost a foregone conclusion from the
start. Thus, in his preliminary remarks on 'the dimensional approach,' he tells us that the whole
notion t[...] as appropriate for qualitative characteristics as Dr. Eysen[...]
Hypothesis II. [...] psychoses which Dr. [...] 'dimension' or factor, namely, "a continuum [...]
extreme cyclothyme" (or manic-depressive). Dr. Eysenck's twenty tests are designed to investigate
this second hypothesis as well as the first. [...] yield two factors. Of these the first is
identified [...] it is argued, the second should be identical [...] contrasting 'schizophrenics'
with 'depressives,' if such a dimension exists, [...]
1 Here it is surely Dr, Eysenck who is basing his о 


out of date.” What he says is not true of other sciences, “ 
to us,” says Schrödinger, “ owin If, however, we 
envisage the development of „century, we get the impression that the 
f 0 ries much against his will (Schrédinger’s 
wi а 1 cience and Humanism : Physics in our 
ee рр. 30, Bay e Same is true in the biological sciences, notably in the genetic study of 
abandoned: Win a € view that ascribed all differences to continuous variation has got to be 


а! itati i Sativa S 
The Inheritance of Moni] Dii foe differences, not quantitative (cf. Myerson, А., 


Now the saturations for the second factor reveal a zero correlation with the biserial correlations 
which measure the power of the tests to discriminate between * schizophrenics ' and “ depressives." 
Therefore Kretschmer’s notion of a second psychotic factor is disproved: “it would appear that 
schizothymia-cyclothymia does not exist as a separate dimension of personality." Moreover, since 
the scores of the schizophrenics are in general somewhat better than those of the manic-depressives, 
we may (it is suggested) reasonably conjecture that schizophrenia is, as it were, a mild form of manic- 
depressive insanity, intermediate between the latter and perfect normality. 

But surely there are other ways of accounting for the zero correlation and the scores. First 
of all, just because Eysenck's second factor and Kretschmer's second factor are both orthogonal to
psychoticism, it does not follow that the two must be one and the same; several dimensions or factors
might exist,¹ all orthogonal to psychoticism and to each other. Secondly, the saturations for Eysenck's
second factor are of very doubtful value. They are reached after a double reversal of signs ; and, 
whenever the initial coefficients are small, such reversals are extremely arbitrary : with a different 
mode of reversal, quite a different correlation with the biserial coefficients might have been obtained. 
In any case, the figures correlated have for the most part a decidedly low degree of significance ; 
of the biserial coefficients, at least 16 are definitely non-significant ; of the factor-saturations none 
of the negative figures in the normal group is significant, and two at most in the psychotic group.
Thirdly, it is quite possible that such tests as ‘reading prose,’ ‘abstractions,’ ‘mirror drawing,’
and the like—even if their reliability remained reasonably high when performed by psychotic 
patients—were not suited to elicit the special characteristics of the schizothymic and cyclothymic 
temperaments. In view of the similarities between schizothymic tendencies and introversion, and 
between cyclothymic tendencies and extraversion, I should have thought that the methods adopted 
with some success by previous investigators in their endeavours to verify the factor of introversion 
and extraversion might have furnished more convincing evidence. The mere fact that “the requisite
deduction could not be verified” hardly justifies the far-reaching inference that is drawn. Finally,
the test results themselves seem highly inconclusive. Dr. Eysenck’s first and most important factor
(“psychoticism”) only accounts for about 10 per cent. of the test-variance. His second factor is
smaller still. Indeed, he himself does not venture to say what it really represents.
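
The arithmetic behind a figure of this kind can be sketched briefly. Assuming standardized tests and orthogonal factors, the share of the total test-variance attributable to one factor is the mean of its squared saturations. The function name and the twenty saturations below are hypothetical illustrations and are not Dr. Eysenck's figures.

```python
def variance_share(saturations):
    """Proportion of total test-variance due to one factor.

    Assumes each test has unit variance and the factors are orthogonal,
    so the factor's contribution is the mean squared saturation.
    """
    if not saturations:
        raise ValueError("at least one saturation is required")
    return sum(a * a for a in saturations) / len(saturations)


# Hypothetical saturations for twenty tests (illustrative values only).
example_saturations = [0.3, -0.3, 0.4, 0.2] * 5

if __name__ == "__main__":
    share = variance_share(example_saturations)
    print(f"share of test-variance: {share:.1%}")
```

With saturations of this modest size the mean squared saturation is 0.095, so a factor whose saturations average about 0.3 in absolute value leaves roughly nine-tenths of the test-variance unaccounted for.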
I do not doubt that most of the mental characteristics which Dr. Eysenck has tested, and many
of those which, in an exaggerated degree, form the more superficial symptoms of the neuroses and
milder psychoses, might be expressed in terms of quantitative factors of the usual type. But that is
by no means the same as “showing that neither neurotics nor psychotics are qualitatively different
from normal people.” The temperature, rate of pulse, rate of breathing, and other mensurable
symptoms of patients suffering from measles or typhoid fever could be expressed in terms of one or
two quantitative factors. But that does not demonstrate that there are no qualitative differences
between either of these and normal health.
Reliability of Tests. In my review, I ventured to suggest that possibly the low coefficients which
Dr. Eysenck has obtained (biserials, inter-correlations, and saturations) might be attributable to
the fact that the types of test chosen were not likely to yield very reliable measurements, partial y 
i ought it unfortunate that Dr. Eysenck’s tables “ seldom 
Nevertheless, 


used (body-sway, persistence, and the like) ^ facto 
reliabilities ”. Үл. Pie ты references. The references. that I mur Pure ET aire 
ЕЗІНЕ low reliabilities for tests of the general Бре in question) һе dismi 1 
y case, he sa: ‘abilities for his own tests (or rather their 10 ) from 
ihe validity feel the rela Ue twenty tests used to discriminate psychotics from normals 
ighest validity coefficient was 0:402, and with those used t 
none reached 0-600 : the averages are about half the size. 
proving el the reliabilities are “ satisfia oI, E sen shos 
А at le.” М H асі; ioned whether 
iê valid pad cee Vy оета abled ] did not * suppress the fact ; exi. err 
In accordance with the principles of‘ criterio-analysis,” Dr. ALAS abo a er arion 
Obtained for certain factors reached by rotation. Once again he finds them палата not possibly 
therefore considers his conclusion to be confirmed. But in any case 
ed that his method of rotation was 


he fact that these tests have been shown 


'Scover any new dimension. 0 
sra pL Section 7 of his reply, he complains e Pata T at PPS a mistake, I apologize. However, 
graphical, whereas it was “ precise and algebraic. he says nothing about an ‘algebraic 


73), bout 
ed procedure e Its.’ He dismisses my remarks 
mu of the Toy saying that my ‘ understanding 
e urged by his former collaborator 
d the procedure. To these 


ea inital exposition of the pr opo e 
Шоп, and he does speak of “ a graph n 
On *eriterion- or. EHO SI (as he variously term: 
Ni it) is inadequate.” But my main criticisms e sorely 
г. Lubin (Psychol. Rev., LVIÍ, pp. 54f.), who must $ 
Dr. Eysenck still attempts no reply.

And at this point a critical reader might be tempted to ask whether it is not Dr. Eysenck
eee Rats e some of the relevant facts. In his book (p. 125) he states that 


the “testing was carried out by Drs. Clarke and Gravely” (who at that time were co-operating
with him at the hospital), and that “the account given by Clarke has been followed . . .”
But Dr. Clarke tells us that “none of the tests” (which included “body sway”
and others “selected in consultation with Dr. Eysenck as the most promising”) “has a sufficiently
high validity coefficient to claim any practical value for screening or diagnosis” (Brit. J. Educ.
Psychol., XX, p. 202). Dr. Gravely reports very similar results (see the theses quoted by Dr.
Eysenck). These findings scarcely support his present line of defence.

Priority. Dr. Eysenck’s last and longest section is devoted to the question of priority. He
complains that “throughout [my] review [I] attempt to relate [his] results to those of Burt, castigate
him for not agreeing with Burt, attempt to establish Burt's priority with respect to such concepts as
‘emotionality’” (a concept which, he says, was really due to Heymans and Wiersma), and generally
that I “write as if no other psychologist had ever written on the subject of personality.” The last
sentence is surely an instance of what Dr. Eysenck has labelled “misrepresentation through
mis-statement of fact.” If he will be good enough to refer once again to my review, he will see that I
refer to more than a dozen other psychologists by name, including “eminent writers on personality,
like Allport . . . ,” and several others whom Dr. Eysenck himself had ignored. I made no attempt to
establish anyone's priority in regard to the concept of emotionality. The sole priority with which I
was concerned was priority relative to Dr. Eysenck, and that in respect of his general scheme of factors,
not in respect of one particular concept. Nor did I “castigate” him for not agreeing with Burt: I
simply regretted that, on the numerous points about which he did agree, not with Burt merely but with
other investigators, the only reference he generally gave for the “demonstration” of the several
factors was, as a rule, some particular paper of his own. He replies that his work was not meant
to be “an historical treatise.” But I was not asking for a history, but simply for a less misleading
mode of reference. His ‘bibliography’ runs to more than 300 numbers; yet this, like the text,
omits most of the previous British investigations which have dealt with the factors he describes and
reached much the same conclusions. I had presumed that the omissions might be due to the fact
that he was unfamiliar with the earlier English literature. He rejoins that apparently I did not
“consider the possibility that other causes than ignorance might be responsible.” I did; but
thought it more charitable to dismiss them.

He professes surprise that I should relate his scheme to that of Burt. There are two obvious
reasons. First, apart from the single factor of psychoticism and a slight change in nomenclature,
the two schemes seem almost identical. Secondly, Dr. Eysenck received his own psychological
training under Burt, and carried out his earlier factorial researches in his laboratory. Indeed, in
his previous book he had proposed “a theory of personality organization,” which, he said, was
“based on Burt's views of the nature of the statistical factors and on the hierarchical theories of
McDougall.”


workers like Jo 
Wiersma used “ 
in order to dem 
the resemblances, I confess, is som 


will show, his first investigati 
methods was 5 ] ате 


star т ее years before the date gi Dr. ск 
бс d publication by Heymans and Wiersma. Dr. E гуа 
ased, 


All 
to the 5 
Dr. Eysei 


But I venture to think that 
personality are also 


origins of the factorial study of 

ЕР ое 1 h Е lo; Nevertheless, it is well to be 

OE ork of early continental writers, which British writers have perhaps too often 
Misquotation. The rest 


refers to * Burt’ n appears 
factors discovered b alsine arts 1915 Report" describ 


T 20 in commenting on his account I 
BRO itota ne ben coefficients were *' based, not on impressio: s 
since Burt ** does y Я ys, I am guilty of misquotation, 

not his correlations were based." If so, гра to Professor 
Burt; but the ‘1915 Report’ explicitly states that the investigation “departed from

hitherto used in studies of character qualities " and that the een correlated беда сод 
90: items of behaviour “ systematically observed and recorded for all members of the groups.” 
It is therefore Dr. Eysenck whose account conveys “a completely erroneous impression.”


с but was also found among those 
and from this he concludes that the factor is “ little else than the factor 


It is thus Dr. Eysenck who writes as though only one psychologist 
of repute had put forward opinions differing from his own. 


different from normal people is opposed to Burt's assumptions." 
urt's assumptions in regard to adult psychotics, which, for all I know, may be much the same as 
* would be opposed to Eysenck's assumptions, 


and would probably reply that such differences “were to be expressed in terms, not of a single 
I supposed that anyone familiar with the 


15 numerous complaints about the way he supposes . ispal y 
express a sincere admiration of 


readers to the fact th i 

t my review went on to 2 сасар 
andIm аст х “j i d industry it displays. 
ore than once paid a tribute to the “ ingenuity anc? ту р! yS er trifling points about 


the incompl 
8 teness of the arguments or quotation 
diverted att Edd g i i d to stress. I had hoped that the 
sues I endeavoured to 5 1 at t 
{rated Sachi ео impor the conflict between Dr. Eysenck and the psychiatrists 


4) and perhaps elicit a fuller an 
baer eit i i thod used is also indicated in the 
i that th L.C.C. Report. But the metho 
il abstract of his British, pie = paper. Pif Dr. Eysenck still has gani ontis eee 
wail find the experiment fully described in Burt's later article in Character ал Lae a Bae e 
cthod itself is again explained in Brit. J. Med. Psychol., XVII, p. 164. et mi TM erred 
by meet Dr, Eysenck’s criticisms and verify his q ions, I have found mnyselt cont ТЯ 
у (Ве careless way he gives his references. In his bibliography at в) One TiRED 
dicem here is nothing icr by pur. оп 
ers is bein; 
cri nage at all. Passing to the nee dh RE ote 105 
2264 in secti tt ‹ was 
Eleni Jas re are many HP he rind conse es 
general stor of neurolicom L. Reymert), To pecialized factors for certain types of neurotic dis- 


Order. The former he believes to be wrongly named, and to 


Which, he iatri dpoint 
Р suggests, ате from а psychiatric stan poi 
р Sed 200; а Cattell іп his Description qa VEEE 
е numero ited by Raymon 5 B icism."). Cattell himself writes : 
us references cited by and * Neuroticism s аи іше factoring 


х pSorality (Inde ‘ lit 
; X, S.v., “ General Emotional 
oot obably the worst example in applied. psychology ; пасо 
and 15 With respect to the alleged * neuroticism factor ; ticism factor.’ ", He himself identifies 
it d the first has n laim than the others to be * the neuroticis™ Sut is as important in the 
CT factor СУ This, he says, * ties up with vem emos (cf also id., Personality, p. 502). 
Саа of delinquency as in the causation of neurotic disturbances : 4. ^ 

my real objection was not 
il, but rather that a purely 
was pleading for was a more 
ple, which has proved so successful in “ unravelling 


€ I have seen references to * Labora- 
ychology of Humour and (2) the Conte 
Tests. If copies are stili available, we 


ve some indication of the contents 
Reply.—A few copies of the Notes are still available, and can be obtained by writing to me at the 
Department of Psychology, University College, Gower Street, London, W.C.1. The note on 
‘Humour’ includes a fairly complete summary of the theories of humour, and a report of illustrative
en the note was prepared) dealing 
2 orrelations are given both * between 
d * between Persons,’ together with the results of the factorial analyses (based mainly on 

i ach Test tentatively puts forward a 
е contents of the replies rather than 


PUBLICATION OF TABLES 


3 at authors who wish to contribute statistical papers to any of the 
journals published by the British Psychological Society should be required to supply copies of the 
Tences are based. If possible, the tables should be printed with the 

с i his is impracticable, then the 
rence in the Society’s office, 


thesis, On examin 


г. O'C erh put the matter most emphatically. “ Dr, p , od," he 
Writes, “ js se Saeed table . 5; 59 pleasing to that kind of scientific mind Which sence to remain 

m the human element (Bull. Brit, Psychol. Soc. ІШ, 1952, р. 115). Dr. Maddox has 
; and several later reviews, I find, 


Vol. VI. The British Journal of Statistical Psychology. May


BOOK REVIEWS 


Contributions to Mathematical Statistics. By R. A. Fisher. New York: John Wiley & Sons,
Inc. London: Chapman and Hall. 1950. Pp. 680. £3.


This large volume contains the first comprehensive collection of Sir Ronald Fisher’s writings
on mathematical statistics, and forms a notable addition to Wiley’s series of statistical publications.
The selection has been made by the author himself. Forty-two contributions have been taken from
articles and lectures published before 1943; a further note refers to three others, not actually included,
of which one appeared as early as 1915. The papers, some of which were printed in octavo journals,
others in quarto, have been reproduced on their original scale by a photographic process. Professor
Fisher has made a number of deletions and emendations, and provided brief introductory notes,
linking the various contributions with each other and setting them in the context “of the common
knowledge, or common misapprehensions, current at the time they were written.” An index to the
whole has been prepared by Professor S. S. Wilks.

The modern theory of mathematical statistics has been almost entirely created during the last
seventy years. The Contributions thus cover nearly half this period. Although in many respects
they form a fairly self-contained body of knowledge, it is scarcely possible to understand the achievement
they represent, except in relation to statistical developments during the earlier half of the same
period. This witnessed the appearance of the long series of publications by Professor Karl Pearson
and his ‘biometric school’; and it might be said that the place held in the earlier phase by Pearson’s
‘Mathematical Contributions to the Theory of Evolution’ and kindred papers is as central as that
held by Fisher's Contributions in the later. Under Pearson's leadership, detailed investigations were
made of frequency functions; these led to the well-known Pearsonian classification of frequency
curves and the use of moments for fitting them to empirical distributions; most of the multivariate


descriptive procedures appropriate to biometry, including nearly all the familiar coefficients of 
matically developed ; and the problems of sampling, and of 


les, were studied both experimentally and mathematically. 
dern frequency distributions—was rediscovered, 
and has since become closely associated with Pearson’s name. On the other hand, the nature of 
fectly understood. The method of inverse probability, with 
routine procedures were regularly based on 
phrase * more than three times the 


Ges of association in a 2 х 2 table. For its prob: 
5 CEDE ago as 1913. Yet to the present day no exact sampl 
2 relation so computed. Thus, not even 1 
sam; li possess any rigorous test of signi! 
pling distribution of the product-mom 


Both Gosset (‘Student’) and Soper had given approximations.
Nevertheless, while the general problem of statistical inference remained confused, the line of
progress had been indicated as early as 1908 when Gosset, in his classical paper ‘On the Probable
Error of the Mean’ (Biometrika, VI, pp. 1 et seq.), published the t-distribution. On the other hand,
the theory of statistical estimation remained extremely backward. The importance of ‘degrees of
freedom’ was entirely unrecognized; and thus the essential foundations of modern statistical analysis
had still to be laid.
It was shorti; i Fisher made his first important contribution by solving 
y after this that Professor Fisher i t : ka X. 1915, 
istributi tion coefficient (Biometrika, 5 
ct distribution onda in Metron * on the probable error 


Pp. 507- Н ticles 
aa ee few years later DY sample d (which incidentally gave the exact 


ion of the partial correlation 
ject of the note which now makes up 


Sampling distributi i lass coe! 

Coefficient." 5 ution of the intra-class cy forms the subj 1 kes | 
P. 2 The substance of these three artic es. p does not permit republication 
f * the established policy ee The 1915 paper contained 


Der 1 of the S emi 
о i present volume : since 
vl сощ that journal,” the bape t 
se of Professor Fisher's celebra i ingle point)—a method which he 
е novel i i ample of л variates as asi - 
Tought to Mean ee Е and effect on various problems of sampling. In the second of 


* (differing from earlier methods 


articles we have the origins of the z-distribution and of the analysis of variance. In a biographical
note (published in Sankhya, 1938, and included in the introduction to the volume) Professor
Mahalanobis discusses how Fisher’s early training and practice in the use of geometrical methods
may have been partly responsible for his subsequent applications of them; but he offers no explanation as
to why Professor Fisher should have chosen to work with z, i.e., with half the natural logarithm
of the variance ratio, rather than with the variance ratio itself, which was subsequently tabulated by
Snedecor and labelled, by way of a pleasant compliment, with the initial F.
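
The relation between the two statistics is simple enough to set out in code. What follows is a minimal sketch of the transformation just described, assuming only that a positive variance ratio is available; the function names and the example ratio are illustrative and are not taken from the volume.

```python
import math


def z_from_variance_ratio(variance_ratio):
    """Fisher's z: half the natural logarithm of the variance ratio."""
    if variance_ratio <= 0:
        raise ValueError("a variance ratio must be positive")
    return 0.5 * math.log(variance_ratio)


def variance_ratio_from_z(z):
    """Invert the transformation: the variance ratio is exp(2z)."""
    return math.exp(2.0 * z)


if __name__ == "__main__":
    ratio = 4.0  # hypothetical ratio of two variance estimates
    z = z_from_variance_ratio(ratio)
    print(z, variance_ratio_from_z(z))
    # Interchanging the two estimates inverts the ratio but only reverses the sign of z.
    print(z_from_variance_ratio(1.0 / ratio))
```

One property the sketch makes visible is that interchanging the two variance estimates inverts the ratio but merely reverses the sign of z. Whether that symmetry influenced the original choice is not something the review settles.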


polynomials as “a means of isolating groups of components susceptible of separate explanation.”
This topic is then resumed in Papers 16 and 37 which deal 


may observe Pyschologists, in spite of the frequent o 
curves of work or learning, seem to have made little use of these methods. 
Papers 4 to 9 have chi-squared as a common them 
“ goodness of fit.” Before these articles were published і 
number of frequency groups as the quantity defining the 
test of significance, no matter what the form of the data. But here we are introduced to the now 
familiar concept of degrees of freedom. Paper 6 (1922) gives the di: 
and is particularly rich in its implications. The f test is showi 
regression coefficients ‘linear or non-linear, sim 
its general applicability was 


Paper 12 provides an be demonstration of the relations between the distributions of 
e 


t a ; and the next paper (14) describes the general sampling 
distribution of the multiple correlation coefficient. Since very few statistical text-books written for
psychologists offer a clear account of the relations between these different distributions, Professor 
Fisher’s systematic discussions are especially to be re : 


istori, `) provides a comparatively 
historical і B 
f Professor Fisher's o rical standpoint, contai 


wn views. Of the rera a MS no mathematics, eng 
Of the remaining papers those of greatest interest to the psychologist are numbers 32, 33, and 34,
which deal with the discriminant function, a device which has already been


of articles published in th 
In vi its si. 
NODE Sua ben nU EOE. mains mega зсагоыу 2 considered expensive. It is primarily a 
understandi isti IS of the highest val i an 
ing of statistical methods rather than of mathematical statis, ^ ШУ eral 


nd has formed the subject 


Progress in Clinical Psychology. Vol. I, Section 2. Edited by D. Brower and L. E. Abt. New York:
Grune & Stratton. 1952. Pp. xxiv + 266. $5.00.

In their preface the editors announce that they have planned to issue every second year a volume
summarizing the progress of clinical psychology. The first in the series, however, is intended to
provide “as complete a coverage as possible of the past six years.” Section 1, already published,
presented “an historical and systematic résumé of developments” and a detailed account of “diagnostic
and therapeutic instruments and procedures.” Section 2 is divided into four parts, dealing
respectively with developmental processes, applications to special areas, approaches to clinical
psychology, and professional issues. Each chapter is written by a specialist in the branch, and
includes a detailed appendix of references.


In studying the various summaries, the point that is most likely to impress readers of this
Journal is the way in which the current developments in America are tending to stress the value of
statistical research and particularly of quantitative checks on both theoretical and practical claims.
As Professor Cattell observes, “American clinicians seem now to realize that their best hope of
making substantial progress in clinical problems lies in the development of refined statistical
procedures.” The longest chapter in the volume is Professor Kogan’s review of ‘Recent Statistical
Methods’ as developed for clinical work. He deals with four main topics.


(i) First, he tells us, “one of the most active areas during recent years has been the attempt to
devise systematic methods for handling profiles or score patterns in ways acceptable to the clinician.”
After reviewing the more important devices that have been proposed, he concludes that those likely
to prove of the greatest service are the techniques of correlations between persons, of intra-class
correlation, and of multiple discrimination. One interesting application of the principle of ‘correlating
persons,’ the so-called Q-sort technique, is being tried at the Counselling Center at the
University of Chicago, suggested, it appears, by Dr. Stephenson: on the other hand, we are told,
“Dr. Cattell has criticized both the nomenclature of the method and its dependence on non-objective
data.” Dr. Cattell himself contributes a separate chapter on P-technique. Intra-class correlation
has the advantage of well-established tests of significance, but seems to have been used less freely
in America than over here. In illustration of the possibilities of the discriminant functions
Dr. Kogan refers to work reported in earlier articles in this Journal, particularly by Lubin, Rao,
Slater, and others, in discriminating neurotic and normal groups.


(ii) A related field of work arises out of the need to develop special methods for validating tests
of personality, since those hitherto used with tests of cognitive abilities have been regarded by
many clinical workers as inappropriate. Several investigators, notably Zubin, have put forward
devices for validating projective techniques, while Cronbach has proposed a rather complex
procedure (based on Vernon’s blind matching procedure) for validating qualitative assessments of
personality.
(iii) A third line which is attracting considerable attention is the attempt to develop
Criteria for Pd E a Бы леті е.в., uS Bon ot Pere 
tsfeld’s theor re, and the various factorial methods for s ance 
Of the general pas ДС poro osn declares that he himself is prejudica UE prog 
Ing that the scaling approach (not necessarily linear scaling) will be found to Toe hen ode 
ms T€ adequately than the approach by multiplex testing, but the scaled compon 
*d back into reasonable models of human beings. 
Psychology ' (as well as other papers 


;, (iV) The 195 i * Research Design in Clinical a eolas 
cited by Profesor KOERD HS attracted increasing attention to the nee for m р к abis 
A vestigations in this field. Professor Kogan deplores tie E comparable to that of 

'Scover any comprehensive discussion of D RE of a control group has been increasingly 


Teenwood in th А = h 5 3 
i е field of sociological research. | | atment. But he considers it 
«081 f different modes of tre * substitute the multi- 


: 7 osal to 
ade of Fisher’s proposa for inductive generaliza- 


broader DE Fisher himself, the ‘ latest 
tions.” | ê À AT 
in thi . “Recent 
Mie Buen HAE been more promp Eon of analysis 4 d 
lev n 3 i er Integr: 
*lopments," so Dr. Kogan believes, “ portend а саи ke Rhetorical 


Containe 

On the on 

i > 

stock gators are far too ready to declare tl E 

Statistical techni yithout considerin 

t chniques, wi ПЕ od 

He Very method SBIEH their problem demands. Among т 1 
non-parametric or distribution-free methods of statistical inference, such as the ‘order
statistics’ of Wilks and the ranking methods of Kendall, rightly receive considerable stress.


izati i i ter use 

i * P-technique Factorization,’ Professor Cattell begins by urging a grea 2 

of aon generally. Tor clinical researc he, Тоохан most usetul to be оа à 

analysis and the analysis of variance, since they do not require ^ СЫ бы ИЕ ашаа 

ifici: ation of all variables except one, which has prove 50 ( uc 

a Sealine HIE E Factor analysis he regards as “ the ideal device for carats the 
total personality”; and in particular, he maintains, “ the new сонар proce ure 

P-technique ! gives precisely an analysis and description of unique personal trai 2. аа psychio- 

In the remaining chapters the surveys of the literature on special branches о n. t feld by the 
logy may enable the reader to judge the uses made of statistical methods in his or па паана 
summaries they give of the latest results. Their general effect, it may be said, is A pcs dn the 
the need for greater caution in rd D he somewhat optimistic claims so often pu 
field of clinical psychology during the last few decades. "s : 

The publication of these biennial surveys will be welcomed in this country as warmly as in the
United States. It is therefore unfortunate that no very systematic attempt seems to have been made
to cover recent British contributions, especially as several of the methods and results quoted as new by
the American authors have long been accepted over here. There are generous references to work
published in this Journal, but practically none to relevant articles in the other psychological journals
published in England. Nevertheless, in their selection of material and in their appraisal of it,
the general editors and the authors of the several chapters display a discernment that is at once
sound, critical, and impartial. Certainly, if we may judge by the two sections now available, the series
will be of the greatest value to all students of clinical psychology, and should be a challenge to the
statistical psychologist to produce and apply more appropriate research-techniques in this important
field of work.


Cyril Burt 


The Young Delinquent in his Social Setting. By T. Ferguson. Oxford University Press, 1952. Pp.
xii + 158. 10s. 6d.


In an earlier volume (The Young Wage-Earner, 1951) Professor Ferguson has described an 
enquiry into the after-histories of 1,349 Glasgow boy: 
present study forms a supplement to that broader sur 
of juvenile delinquency and to asce 
light on physical and mental factors, minor studies were also made of smaller groups from schools 
for the physically or mentally handicapped. 


. He finds that over 12 per cerit. 
crime before the age of 18, 


ility* : cf, Burt, C., < 
NE 1927.0: 60). Profess i 
term * O-technique? to Cover correlations between di 
convenient to adopt Burt's term P-technique to refer ee 
what he himself had elsewhere called * intra-individual coy. 
and Measurement of Personality, ) 




aly thin a single person,” ton 
aration” (cf. Cattell, R., Descripti 


their new setting the families conform to standards of conduct prevailing amon ili 

present severely crowded.” But will they conform ? That sae the cae “By OE he fee 
chosen Professor Ferguson tacitly invites a comparison between his methods and results and those 
set forth in the volume on The Young Delinquent by Professor Burt. Now Burt has shown that, to 
a very large extent, delinquency is a by-product of a general inferiority, characterizing families at the 
lower end of the socio-psychological scale and apparently resting on an innate and inheritable basis. 
Intellectual, educational, temperamental, moral, and civic inferiority are all correlated together ; 
and such families tend to create or drift into the unhealthy overcrowded districts where delinquents 
are commonly found. Improvement in living conditions might do something towards solving the 
problem, particularly in the case of those individuals in whom extrinsic conditions are more important
than intrinsic; but it could hardly cut down delinquency to the extent he implies.

Professor Ferguson’s conclusions might be summed up by saying that, so far as the conditions
covered by his survey are concerned, the factors conducing to youthful delinquency appear after
all to be much the same in Glasgow as in London and other large cities where surveys have already
been made. It is, however, not altogether easy to decide how far his data really support the inferences
that he has drawn. To begin with, he ignores the somewhat special conditions which, according
to other social investigators, appear to prevail in Glasgow. Burt, in his early studies at Liverpool,
found that the amount of delinquency among children of Irish immigrants or of Catholic parents
was disproportionately high; and this was confirmed in a later study by Bagehot. It is conceivable
that a similar conclusion might hold good if the Glasgow figures could be analysed according to
districts. Professor Ferguson prints figures for ‘Church attendance,’ but says nothing about
religious denomination. And in general “the sociological information which he gives is” (as
another critic has remarked) “woefully inadequate.”

Medical writers who deal with psychological problems are fond of criticizing “the psychologist’s
passion for statistical calculations.” Accordingly, one of Professor Ferguson’s chief aims was
“to bring out the relations (between crime and causal factors) without invoking the aid of elaborate
statistical techniques.” Thus, when a difference is observable in the figures obtained from two
contrasted groups, he appears to accept it straight away if it agrees with his preconceptions (even
though it is not statistically significant) and to ignore it if it does not. No endeavour is made to use
the figures from the larger control group to eliminate the influence of the various conditions as
found among the delinquent groups (e.g., by calculating coefficients of contingency or correlation).


To demonstrate the subsidiary factors all that is done is to omit those cases in which the
major factor is found, and then recalculate the percentages, a somewhat fallacious procedure.
In criticizing the methods adopted by Professor Ferguson in his earlier survey, Mr. Alec Rodger
declares that the writer “has slipped on practically every banana-skin there was for him to slip on”;
and he agrees with the critic in The Times who complained that, in their discussions of delinquency,
the investigators had constantly confused mere association with actual causation. The present 
Yolume endeavours, not (it must be confessed) with very great success, to circumvent these Guest 
eat Rodger points out, many of the difficulties have arisen Баша, rom the yey, онр s A 
ntial problems were carelessly formulated in terms very о b е. E 
as a result was somewhat amateurishly planned. Nevertheless, the tables contain much usefu 
Material. There is an obvious moral to be drawn : if those who are exper in medicine wish to 
Aunch out on enquiries іп the psychological and sociological fields, they s ould be urged to Se 
(О-орегаЦоп, or at least the Wes of those who are familiar with the innumerable pitfalls a 


tr i 1 
ained in the appropriate techniques. 


are fond of criticizing “ the psychologist’s 
f Professor Ferguson’s chief aims was 


Contributions of Psychology to Social Problems. By Cyril Burt. London: Oxford University Press.
1953. Pp. 76. 5s.


Since the beginning of this century, L. T. Hobhouse, the first Profes[sor ...]
[...] Economics and Political Science, was looked upon as the leader [...]
[...] the establishment of [...]
chiefly due, had [...] the older, speculative brand of Politi[cal ...]
[...] scientific discipline "founded on objective methods of research," particular[ly ...]
"[...] on first-hand data," which had been introduced by [...] and Francis Galton.
"Every [...]," said Mrs. Webb, "which is derived from [...] observation or experiment, has to be
[...], as well as [tested by] relevant statistics."² Accordingly, [...] like Hobhouse,
[...] McDougall [...] themselves to formulate, in the shape of tentative hypotheses, the basic
problems which appeared [...] the scientific investigator in the [...] of work.

¹ [Occup]ational Psychology, XXVI, 1952, pp. 221-[...].
² Beatrice Webb, My Apprenticeship, p. 349.



In his present survey, which forms the substance of his Hobhouse Memorial Lecture, Sir Cyril
Burt [seeks to] summarize the chief advances made during the last fifty years in the many-sided attempt
to solve such questions. One of his incidental aims, he tells us, has been to draw attention to British
contributions, "partly because recent writers (who appear more familiar with American work)
have so often ignored them, partly because it seems rash to assume that conclusions about [social]
matters reached by investigators in other countries will also hold good over here," but above all
because he believes that "during and since the recent war British psychologists, while devoting
themselves more and more to practical applications, have tended increasingly to neglect research
on fundamental issues." He points out that, although research in social psychology has been
carried out far more extensively in America than over here, particularly during the last two decades,
nevertheless, from the time of Booth's surveys onwards, such work has exercised a far more direct
effect on legislative and administrative changes in this country than in the United States.


He begins with a brief account of the special scientific techniques (observational, experimental,
and statistical) devised for such investigations. Most of them, he claims, owe their origin to British
pioneers: Galton, Booth, Pearson, the Webbs, and Hobhouse himself. To a large extent, as he
points out, they were first tried and refined in the course of investigations on social and educational
enquiries conducted with the aid of the schools, where material was more readily accessible.


Turning to the results hitherto attained by the use of such procedures, he considers, in the first
part of his review, what are the special mental characteristics of the human race that specially affect
social organization and social relations, [...] called 'social instincts' and 'differences in [...].'
[He then] discusses, in somewhat greater detail, the converse problem, namely, [...]
[organiza]tion and relations on the mental characteristics of individuals, as [...]
[com]munity. Here he recapitulates the main [...]
the problems of sex, social class, and social attitudes and op[inions].

The third part summarizes a number of typical enquiries, again based mainly on the combination
of case-studies and statistics, in regard to recent trends and [...]
occupations, and the like. The whole survey forms a well[...]
the conclusions so far established, and closes with a strong plea for systematic research in the social
branches of psychology comparable to that which [...] branches.

[...] Mason


The Causes and Treatment of Backwardness. By Cyril Burt. London: National Children's Home.


This is the most [...] of a series of books, [...] Children's Home, and dealing [...]
education. The purpose of the present volume [...]
[...] statistical verification. [...] Incidentally, useful
suggestions are made as to the [...] of mathematical procedures, [...] own schools.
[...] interest is perha[ps ...] with a concrete interest in hum[an ...].


E. A. Gardner 


Facts from Figures. By M. J. Moroney. London: Penguin Books, Ltd. 1951. Pp. viii + 472. 5s.


Mr. Moroney describes his book as "an attempt to take the reader on a conducted tour of the
statistician's workshop": the visitor is allowed to watch experts at their task, and then encouraged
to try his own hand. Those who are invited are "not only students, but all whose work calls for a
general knowledge of the subject, particularly in the world of industry and research." And they
are assured that "the rapid expansion and enormous success of statistical techniques in recent years
is ample proof of the need for such methods."

The publishers are certainly to be congratulated on producing a book of nearly 500 pages with
diagrams, numerous tables, and formulae that must have cost the compositor some skill, all for the
price of two half-crowns. The introductory explanations are racy, lucid, and entertaining. Indeed,
every chapter contains ingenious analogies or illustrations which will often prove instructive, not
only to the novice, but even to those already acquainted with the stock procedures. Special attention
is paid to those branches that are applicable to industrial production and quality control. The
more advanced sections are, on the whole, fairly up to date; and the concluding chapters deal with
some of the most recent devices in the analysis of covariance, the study of time series, and the use
of ranking.

On the other hand, many pages betray signs of haste or carelessness. The theoretical explanations
are often inadequate, and [...] would certainly be frowned upon by statistical critics. There
are awkward gaps in the arguments, references to a diagram which does not exist, and numerous
misprints, in addition to those contained in the publishers' errata (which relate only to four out of
the twenty chapters). Some of the distributions are badly grouped, some of the percentages wrongly
calculated; and several of the answers to questions given at the end are incorrect. No doubt most
of the slips will be corrected in a later edition. Meanwhile, as a semi-popular introduction the
book should be of great help to those students who are unable to afford the most recent text-books
or who find the more academic expositions of mathematical procedures too dull, too [difficult], or
positively bewildering.




Vol. VI          The British Journal of Statistical Psychology          November,
Part II                                                                 1953


THE RELATION OF THE TERMAN-MERRILL 
VOCABULARY TEST TO MENTAL AGE IN 
A SAMPLE OF ENGLISH CHILDREN 


By M. I. DUNSDON and J. A. FRASER ROBERTS 


Burden Mental Research Department, Stoke Park Colony, Bristol, 
and Research Department, Royal Eastern Counties Institution, Colchester 


I. Introduction. II. Material. III. The Correlation between Number of Words and
Mental Age at Fixed Chronological Age. IV. The Association between Number of
Words and Mental Age. V. The Association between Number of Words and Chronological
Age. VI. Vocabulary Norms for the Mental Age-Levels of the Terman-Merrill
Scale. VII. A Sex Difference. VIII. Summary.


I. INTRODUCTION 


The surveys in which we have been interested, including those now in progress,
necessitate the testing of large numbers of subjects. For example, we are engaged
in a study of low-grade mental deficiency, in which part of the plan is to test the
brothers and sisters of the subjects. The most important initial question is whether
or not the sibs, other than the few who are also low-grade defectives, are on the
average of lower intelligence than the general population: in such work tests are
used for comparing relatively large groups, and the interpretation of individual
results is not required. The limiting factor is the amount of testing time needed;
and the use of individual tests taking an hour or more to administer imposes
a severe restriction on what can be done. As an alternative a vocabulary test is an
obvious choice. Since Kirkpatrick ((3), 1891; (4), 1907) first suggested a vocabulary
test for children, much work has been done on vocabularies, and their merits have
been widely recognized. Perhaps the best-known version is that introduced by Terman
and Childs ((10), 1912) into their revision of the Binet Scale. Hence it is unnecessary
to do more than summarize their suitability for purposes [such as ours], and to
indicate briefly the work in progress to which this paper is an introduction.
1. In surveys of this kind, complete [...], or something very near it, is essential.
There is [...] measurement with which it is easier to get a biased sample than
those furnished by intelligence-test scores. Outside the limits of school age it is [...]
the subjects must [...] size of the sibship. Hence group-tests are [...] the whole
of the required [...]. Vocabulary provides an individual test which is essential[...].

2. It is known that the reliability of vocabulary tests is in gene[ral ...] average about .90.

3. Validity too is very high. In the sample described in this paper the partial correlation between
number of words and mental age at fixed chronological age is .84, a figure very similar to those found
by previous investigators.

4. [...] practice effect is minimal. The effect of mere repetition of the same [...]
quite high. As a rule it appears [...]. McMeeken ((5), 1939) describes a retest with the Stanford (1916) Scale


of 140 children after the lapse of one day. The mean increase of I.Q. was 2-46 points ; but she states 
that in vocabulary score no change was observed. We are not aware of any work on the effect of 
attempts to coach children in the general task of word definition ; but it seems unlikely that this 
could have any appreciable success. The only factor likely to be serious would be practice in defining 
the actual words to be used. This can be obviated by providing alternative vocabularies, though of 
course much heavy work is needed to standardize a new one.

5. Above all, there is the great saving in testing time. A partial correlation of .84 between
M.A. and words defined at fixed C.A. means that 70 per cent. of the information is being obtained in 
10 per cent. of the testing time. With a test as applied to an individual the nature of the 70 per cent. 
and of the residual 30 per cent. raises important problems ; but for the comparison of large groups 
this need not concern us greatly, though naturally one would be careful to consider the possible 
associations of so verbal a test with such factors as home background and social class. 
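
The "70 per cent." quoted in point 5 appears to follow from squaring the partial correlation, since the square of a correlation gives the proportion of variance that two measures share. The short sketch below shows that arithmetic; it assumes this is the reading intended, and the function name is purely illustrative.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance accounted for by a correlation r (i.e. r squared)."""
    return r * r


# Partial correlation between M.A. and words defined at fixed C.A., as quoted.
partial_r = 0.84
proportion = shared_variance(partial_r)

# Expressed as a percentage this is a little over 70 per cent.
print(f"{100 * proportion:.1f} per cent.")
```

On this reading the residual "30 per cent." is simply one minus the squared correlation.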

With these considerations in mind, one of us (M. I. D.) is administering four vocabu- 
laries to a relatively large number of children in the City and County of Bristol, the children 
being selected by a method which, it is hoped, will yield a good random sample. The four
vocabularies are the Terman-Merrill, the Mill Hill Vocabularies A and B, and the vocabulary 
from the Wechsler Intelligence Scale for Children. We are grateful to Professor L. M. 
Terman, Mr. J. C. Raven, and Dr. David Wechsler for giving permission for their use. 


We have, of course, no intention of standardizing the vocabularies in terms of anything 
other than themselves, and entirely agree with McNemar ((6), 1942) about the impropriety 
of attempting to convert vocabulary score into some kind of equivalent in mental age, and 
thereafter working out an I.Q. Nevertheless, it is valuable to be able to relate one of the 
vocabularies to a complete scale. We were also anxious to know how far the high average 
correlation taken over a wide span of age varied at different age-levels and whether it is 
justifiable to use a vocabulary as a single test down to the age of 5 years. Should there be 
a sex difference, as indeed we have found in our sample, it is necessary to know whether and 
to what extent mental age is also involved. Vernon ((12), 1949) has published results based
on a sample of English children, showing that their vocabulary scores at most of the relevant
mental ages considerably exceeded those required for a pass. [Beca]use
we had a sample already available, we thought that it might be [...] to put on record data
for yet another sample of English children.

II. MATERIAL


The main part of the data consists of th[e ...] the Terman-Merrill
test (Form [L]) to 450 children [...]
was assembled for another [...] used. At other parts [...]
from those available.


III. THE CORRELATION BETWEEN NUMBER OF 
WORDS AND MENTAL AGE AT FIXED 
CHRONOLOGICAL AGE


Using the smallest possible units of grouping, months and single [words], the
correlation coefficients are as follows: C.A. and M.A., .8000; C.A. and words, [.7349]; M.A.
and words, .9307. The last figure is very close to Raven's ((7), 1948) correlation of .926
between Mill Hill Vocabulary score and Terman-Merrill M.A. The relevant figure is the
partial correlation of M.A. and number of words at fixed C.A. The influence of chronological
age can be appropriately removed by the use of partial correlation as it is a measurement
not subject to error. It is true that the regressions are not linear, but as the linear functions
[account] for the great bulk of the associations it is not misleading to use them. Furthermore,
the non-linear components in the regressions involving C.A. are almost entirely due to the
[11]-year age-group containing too many children of high intelligence, as mentioned in the
previous section. As regards the association of M.A. with C.A., the expected falling off in
the rate of increase of M.A. after the age of 12 is shown by our sample, but is not significant
with these numbers. The partial correlation is .842. A part of the association is spurious
because vocabulary is used in scoring M.A.; but this part cannot be large, as the contribution
of vocabulary over the relevant mental ages does not average more than 10 per cent.
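
The partial correlation of .842 can be checked from the three zero-order coefficients with the standard first-order formula. The sketch below is illustrative only: the function and variable names are not the authors', and the C.A.-words coefficient is poorly printed in the source, so the value .7349 used here is an assumed reading, chosen because it reproduces the stated result.

```python
from math import sqrt


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


r_ma_words = 0.9307  # M.A. and number of words (quoted in the text)
r_ca_ma = 0.8000     # C.A. and M.A. (quoted in the text)
r_ca_words = 0.7349  # C.A. and number of words: ASSUMED reading of an unclear figure

r_partial = partial_correlation(r_ma_words, r_ca_ma, r_ca_words)
print(round(r_partial, 3))  # 0.842
```

Other readings of the unclear coefficient differing in the second decimal place (for example .7249 or .7449) give partial correlations of roughly .849 and .836, which do not round to the printed .842.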
The correlations between M.A. and number of words, calculated separately for each
year group of C.A., are shown in Table I.


TABLE I. CORRELATION BETWEEN MENTAL AGE
AND VOCABULARY AT SUCCESSIVE 
CHRONOLOGICAL AGES 


Chronological Age Number Correlation 


These figures are closely similar to those of other workers. Terman and Merrill
[state] that the correlations between vocabulary and the scale as a whole for single age-groups
range from .65 to .91 with an average of .81. McNemar ((6), 1942) gives correlations with the
composite scale of .71, .83, .86, and .83 at ages 8, 11, 14, and 18 respectively.


IV. THE ASSOCIATION BETWEEN NUMBER OF
WORDS AND MENTAL AGE


The association of M.A. and number of words can now be examined in more detail,
the basic data being shown in Table II. The relation is clearly non-linear. Perhaps some
apology is needed for curve-fitting; but it seems to us that it is helpful to obtain smoothed
curves by efficient methods. When the fit proves to be very good, as in this instance, with


TABLE II. BIVARIATE FREQUENCY TABLE FOR MENTAL AGE AND NUMBER OF WORDS
[Table printed sideways in the original; body not legible.]


entirely non-significant remainders which are close to expectation, then one can feel that
the data have been reduced to their simplest form, a form which permits trends to be [seen
un]obscured by chance fluctuations. It turns out that biquadratics are needed for both
regressions, the equations being

$$y = -12.64 + 6.941x - 0.9849x^{2} + 0.06982x^{3} - 0.001664x^{4},$$

where y = number of words and x = M.A. (3.0-3.11 = 3, etc.); and

$$x = 2.882 + 0.2696y + 0.2675y^{2} - 0.02691y^{3} + 0.0008121y^{4},$$

where x = M.A. [as before] and y = number of words in units of grouping (0-1 = 1, [...]).
The fitting is shown in Table III and the curves are shown graphically¹ in Fig. 1. The


TABLE III. RELATION OF MENTAL AGE AND VOCABULARY: ANALYSIS
OF VARIANCE


A. Regression of Vocabulary on Mental Age

Variation of Vocabulary         Degrees of   Sum of Squares   Mean Square   Variance   Significance
                                Freedom      (words)²         (words)²      Ratio
Linear regression               [1]          13,771.94        13,771.94     [...]      [...]
Addition of quadratic term      [1]             151.95           151.95     35.19      ++
Addition of cubic term          [1]               3.79             3.79      0.88      -
Addition of biquadratic term    [1]              58.32            58.32     13.85      [...]
Remainder between arrays        1[2]             57.02             4.75      1.13      -
Within arrays                   485           2,035.47             4.20
Total                           [501]        16,078.49            32.09


B. Regression of Mental Age on Vocabulary

Variation of Mental Age         Degrees of   Sum of Squares   Mean Square   Variance   Significance
                                Freedom      (years)²         (years)²      Ratio
Linear regression               1             4,356.17         4,356.17     [...]      [...]
Addition of quadratic term      1                 0.32             0.32      0.22      -
Addition of cubic term          1                 7.[46]          [7.46]     5.29      [...]
Addition of biquadratic term    [...]
Remainder between arrays        [...]
Within arrays                   [...]           688.09             1.41


[biquadr]atic terms are relatively large in both instances. The bi[quadratic ...]
improvement, [...] variance between [...]
cut down to a half and with the other regr[ession ...].

The form of the curves throws li[gh]t on the age-group cor[rel]ations of Table I. The
most important feature is the steep rise at the youngest ages, [followed by a rel]ative flattening
and then by a steep rise once again. The high correla[tion] at age 5, the fall to a minimum
at age 7, and then the rise almost to the maximum by a[ge ...] are just what would be anticipated.

¹ We are indebted to Mrs. M. G. Young for drawing the figure.


It is not a matter apparently of vocabulary being a less meaningful test for the youngest 
children, either necessarily so, or because the lower ranges of the scale are less verbal. Rather, 
it would seem, among young children the task is such that vocabulary correlates very highly 
with other mental measurements; then there is an interval during which the number of
words defined becomes rather less meaningful ; finally, by age 9 or 10 vocabulary has regained 
its close association with other tasks. 


FIG. 1. Regression of number of words on mental age and regression of mental age on number
of words. [Horizontal axis: Mental Age, years; vertical axis: Vocabulary, words.]

V. THE ASSOCIATION BETWEEN NUMBER OF
WORDS AND CHRONOLOGICAL AGE


Table IV gives the bivariate frequency table for number of words and chronological age.
The association between C.A. and vocabulary is so much [weaker] than that between M.A. and
vocabulary that with our numbers there is not much point in examining the form of the
regressions, particularly since in this sample, as already mentioned, the great bulk of the
departure from linearity is due to the accidental circumstance that too many clever 11-year-old
children were included. More ample data from a better sample will be forthcoming when
the main enquiry is completed. The difference in appearance between the [two bivariate]
frequency tables (Tables II and IV) is, however, very striking. The growth in variability of
number of words seems largely to keep pace with the growth in the variability of M.A., but
in terms of C.A. the variability of the number of words defined increases very markedly with
age. It will be necessary to allow for this in the age standardization of the vocabularies.


TABLE IV. BIVARIATE FREQUENCY TABLE FOR CHRONOLOGICAL AGE AND NUMBER OF WORDS
[Table printed sideways in the original; body not legible.]


VI. VOCABULARY NORMS FOR THE MENTAL AGE- 
LEVELS OF THE TERMAN-MERRILL SCALE 


As regards total score there is reasonably good agreement between the norms given by 
Terman and Merrill and the results obtained with British children. Nevertheless, many 
individual items do show differences, some being placed too high for British children and 
some too low (Burt (1), 1948). The standardization of the vocabulary test has been criticized
by several writers as being much too easy. Vernon ((12), 1949, p. 184) gives a table of norms,
based on a preliminary research, which differ widely from those given in Measuring
Intelligence. Table V compares the norms derived from the present sample with those of
Terman and Merrill and of Vernon.


TABLE V. NUMBER OF WORDS CORRECTLY DEFINED AT
MENTAL AGE-LEVELS IN TERMAN-MERRILL SCALE


                        No. of Words Correctly Defined
Mental Age              [Terman-Merrill]   Vernon's   Present
                        Sample             Sample     Sample
6                        5                  5          6
8                        8                  9         [...]
10                      11                 13         [...]
12                      14                 17         14
14                      16                 21         18
15.4 (A.A.)             20                 24         21
17.4 (S.A. I)           23                 29         25
[20.0] (S.A. II)        26                 36          -
22.10 (S.A. III)        30                 42          -

VII. A SEX DIFFERENCE 


The present sample shows a small but highly significant sex difference. The means and
linear regressions are as follows:


              No.    Mean C.A.   Mean M.A.   Mean No.   Increment in      Increment in
                     (years)     (years)     of words   words per year    words per year
                                                        of C.A.           of M.A.
Boys          260    10.635      10.881      13.150     1.674             1.681
Girls         242    10.706      10.940      12.665     1.547             1.627
Difference           -0.071      -0.059      +0.485     +0.127 ± 0.133    +0.054 ± 0.058


Using the regressions to adjust M.A. and C.A. to 10.80 years, the sex differences in
favour of boys in number of words defined as at that age are: C.A., 0.616 ± 0.343; M.A.,
0.575 ± 0.186. Thus, for a given C.A. or M.A., boys define more words than girls, the
difference as at 10.80 years being about 0.6 word. The difference is not significant for C.A.,
but, owing to the close association of number of words and M.A., is highly significant for M.A.,
being 3.1 times its standard error. It would seem to extend over the whole range of age,
and is apparently a matter of the position of the regression lines and not of their slope, though
it is possible that the rates of growth might be significantly dissimilar with larger numbers.

It will be noted that it is vocabulary which is slightly sex biased and not mental age
as a whole. In another British sample, that of the Scottish Survey ((9), 1949), boys did better
on the Terman-Merrill Scale than girls; but that sample covered only a narrow range of
age. In the present sample, with its 10-year span, there is no trace of sex bias with the scale
as a whole. Girls are 0.071 year older than boys and their mean M.A. is 0.059 year greater,
a difference of one in the first decimal place of I.Q. Agreement between the sexes could
hardly be closer.
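
The significance statements in this section rest on the ratio of each adjusted difference to its standard error. The sketch below reproduces that arithmetic; it assumes the printed figures read 0.616 ± 0.343 (at fixed C.A.) and 0.575 ± 0.186 (at fixed M.A.), and the names are illustrative.

```python
def ratio_to_standard_error(difference: float, standard_error: float) -> float:
    """How many standard errors a difference lies from zero."""
    return difference / standard_error


# Adjusted sex differences in words defined (boys minus girls) at 10.80 years,
# each paired with its standard error; values as read from the text.
adjusted = {
    "fixed C.A.": (0.616, 0.343),
    "fixed M.A.": (0.575, 0.186),
}

for label, (difference, standard_error) in adjusted.items():
    ratio = ratio_to_standard_error(difference, standard_error)
    print(label, round(ratio, 1))
```

The ratio for fixed M.A. comes out at about 3.1, matching the figure given in the text, while the ratio for fixed C.A. falls below 2, consistent with the statement that that difference is not significant.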


VIII. SUMMARY 


1. Intelligence-tests intended to cover the whole range of school age must be
applied individually. For the comparison of large groups vocabulary tests have
manifest advantages in respect of reliability, validity, relative [...]
effect, and, above all, in the amount of information obtained per unit [of testing time].
Accordingly four vocabularies are being given to school children [...]
by a method which, it is hoped, will give a good random sample. [...] paper
presents data from a sample of English children showing the relation of the
Terman-Merrill vocabulary test to the scale as a whole.

2. The [data consist] of results obtained by applying Form [L ...] children
[ranging] in age from 7.0 [...]
so as to yiel[d ...] with a mean I.Q. of [...] and [...]
them [...] 52 children aged 5.0 to 6.11 years, who were slightly lower
in I.Q. and slightly less variable.

3. The [partial] correlation between number of words [...] at fixed
C.A. is .842, which is in close agreement with the [...].
70 per cent. of the information is obtained in 10 [per cent. of the] testing time.

4. The regressions of words on M.A. and of M.A. on [...] words are non-linear.
The form of the curves is reflected [...] number of words
and M.A. in the seventh, eighth, and ninth ye[ars ...], these being respectively .71,
.60, and .72. The other age-group correlations range from .82 to .90.

5. The number of words defined at the various M.A. levels of the scale agree
[closely] with those of Terman and Merrill up to 12 years. At older [ages] the
children of the present sample exceed the norms given in Measuring Intelligence,
but are closer to these norms than to those given by Vernon.

6. In this sample there is no sex difference in performance with the Terman-Merrill
Scale as a whole. There is, however, a sex bias in vocabulary score; for
given M.A. boys define more words than girls.


REFERENCES


1, 2. [...]
3. Kirkpatrick, E. (1891). '[...] an Ordinary Vocabulary.' Science, XVIII, 107-8.
4. Kirkpatrick, E. (1907). 'A Vocabulary Test.' [...] 64.
5. McMeeken, A. M. (1939). The Intelligence of a Representative Group of Scottish Children.
   London: University of London Press.
6. McNemar, Q. (1942). The Revision of the Stanford-Binet Scale. Boston: Houghton Mifflin.
7. Raven, J. C. (1948). 'The Comparative Assessment of Intellectual Ability.' Brit. J. Psych.,
   Gen. Sect., XXXIX, 12-19.
8. Roberts, J. A. F. (1952). 'The Genetics of Mental Deficiency.' Eugen. Rev., XLIV, 71-83.
9. Scottish Council for Research in Education (1949). The Trend of Scottish Intelligence. London:
   University of London Press.
10. Terman, L. M., and Childs, H. G. (1912). 'A Tentative Revision and Extension of the Binet-Simon
    Scale.' J. Educ. Psychol., III, 205-8.
11. Terman, L. M., and Merrill, M. A. (1937). Measuring Intelligence. London: Harrap.
12. Vernon, P. E. (1949). The Measurement of Abilities. London: University of London Press.


Vol. VI          The British Journal of Statistical Psychology          November,
Part II                                                                 1953


CHANGES IN TERMAN-MERRILL I.Q.s
WITH DULL CHILDREN
A Test of the Roberts-Mellone Adjustments


By ELIZABETH H. SCARR
Research Department, Royal Eastern Counties Institution, Colchester


I. Introduction. II. The Data. III. Comparison of I.Q.s in Test and Retest.
IV. Comparison between the Observed Mean Changes in I.Q. and those Predicted
by the Roberts-Mellone Adjustments. V. Summary.


I. INTRODUCTION 


In a recent paper¹ Roberts and Mellone have devised a method of adjusting
Terman-Merrill I.Q.s for their lack of independence of chronological age. They
describe the fitting of a curve for regression of variance of I.Q. on C.A., obtained
by using the test results (Form L) of the 2,000 children in the Terman-Merrill sample
whose ages range from 5.0 to 14.11 years. From this curve is derived a table of
adjustments, so that the adjusted I.Q.s become comparable at different age levels. In
their paper Roberts and Mellone say:

"An I.Q. [...] may be expected to rise to a maximum of 54 at 5 years
7 months [...]; [then] fall at about 2 points a year to age 8; thereafter
fall steadily at the rate of 4 points a year to 11 years 1 month; remain constant to 13 years [...];
and finally rise to 49 again. From 14 years 7 months or so the rise becomes [...].
Whether in fact these actual changes will be shown on large samples can only be determined by
repeated measurements of the same individuals."

Data for making this practical test have since become available in connexion with
retests made on a sample of mentally defective and backward children as part of a
study on mental growth at low intelligence levels. As the adjustments [...]
variances, their average size in any given group increases with [...]
group from the mean of the general population. In a group of [...] low I.Q.
the adjustments are large; and consequently the sample thus examined p[rovides]
suitable material for making the test.


II. THE DATA

The sample consists of a group of backward and mentally [defective] children
attending Special Schools for the educationally subnormal [...]
[...], Colchester, [...]
Residential Special Schools of the Royal Eastern Counties Institution in Colchester
and Cambridge. All the available children in the selected [...], aged between
6.0 and 14.11 years, were tested by means of Form L. [...]
were not included, partly because relatively few [...].
[...] after the lapse of one [year], the same children were [retested.]
[Some diffi]culty was encountered in obtaining [...]
interval, the main obstacles being absence through illness and the intervention of
school holidays. About 10 per cent. of the original group could not be retested on a
calendar date sufficiently close to that of the original test; and these were omitted.
The complete sample consists of 402 subjects, the age range at first test being 6.0 to
14.11 years. All tests and retests were made by myself.

¹ This Journal, V (1952), pp. 65-79.

For testing the adjustments only those children aged 13.11 years or less at the
first test could be used, since the correction table for Terman-Merrill I.Q.s gives
adjustments to age 14.11 years only. This smaller group consists of 350 subjects
whose ages range from 6.0 to 13.11 years at first test. The I.Q.s range from 20 to
85 at the first test, with a mean of 56.97 and a standard deviation of 14.12.

The intervals between first and second tests varied about 365 days in accordance
with the following distribution: ± 0-7 days, 207 children; ± 8-14 days, 79 children;
± 15-21 days, 41 children; ± 22-28 days, 23 children.


III. COMPARISON OF I.Q.s IN TEST AND RETEST


TABLE I. FREQUENCY DISTRIBUTION OF DIFFERENCES IN I.Q. BETWEEN FIRST
AND SECOND TESTS


Ist LQ. « 40 % 1510. 
2nd 1.0. eS ET 
minus C.A. a 1 
Ist 1.Q. АП С.А, a ea | ae нш 
6.0-10.11 11.0-11.11 12.0-13.11 
+10 2 2 
FO 1 2 3 
+8 2 2 
РЕ i 1 1 
35 3 1 1 5 
+ 2 2 4 2 8 
TÉ 3 6 1 6 16 
F 5; 5 12 20 
uS 1 3 1 m 21 
+ 3 2 
و‎ 12 9 6 8 35 
TIN 8 9 3 15 35 
DR а 16 5 12 38 
E 7 18 4 10 39 
10 4 9 25 
= 9 3 9 
pig 5 4 7 23 
TG i 10 3 D 17 
ar 5 2 2 10 
Bs 7 2 2 11 
-10 i 1 1 5 
-П 1 ! > 
Ег н І 2 
1 
3114. 1 ( 
1 
Number 57 136 4 
Mean ^ - 0:5 Er 42 П 0 
Standard Error a 2101 ~ 2976 = 0357 Tra 
9f Mean 0:403 0:355 0:575 0:367 0:213 
DUO actual number of children was 339, 


339 too few 
alysis is made ІП terms of the 350 one-year е 5 




E. H. SCARR


Table I shows the frequency distributions of differences between test and retest. (a) The mean difference between the two tests, for the whole group of 350 subjects, is -1.417 ± 0.213; the standard deviation of differences is 3.977. (b) The mean difference for the whole group, ignoring sign, is 3.320 ± 0.139, with a standard deviation of 2.604. (c) The coefficient of correlation between first and second I.Q. is
m rae Е 
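The two standard errors quoted in (a) and (b) can be checked from the standard deviations and the sample size. The following is only a minimal sketch, assuming that each standard error is the usual standard deviation divided by the square root of n; the variable names are illustrative and not the author's.

```python
import math

n = 350               # children in the adjustment sample
sd_signed = 3.977     # SD of (2nd I.Q. minus 1st I.Q.), as quoted in (a)
sd_absolute = 2.604   # SD of the differences ignoring sign, as quoted in (b)


def standard_error(sd, size):
    """Standard error of a mean, assuming the usual sd / sqrt(n) formula."""
    return sd / math.sqrt(size)


se_signed = standard_error(sd_signed, n)
se_absolute = standard_error(sd_absolute, n)

# Compare with the quoted values 0.213 and 0.139.
print(round(se_signed, 3), round(se_absolute, 3))
```

Both quoted standard errors are reproduced to three decimal places under this assumption.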

The grouping of chronological age and I.Q. used in Table I anticipates the findings 
of the next section. It will be seen that this grouping shows most clearly the trends 
of agreement and disagreement between the observed changes in I.Q. and those 
predicted by the Roberts-Mellone adjustments. 


IV. COMPARISON BETWEEN THE OBSERVED MEAN
CHANGES IN I.Q. AND THOSE PREDICTED BY THE 
ROBERTS-MELLONE ADJUSTMENTS 


The method of finding the predicted change in I.Q. during the year between the two tests was as follows. The I.Q. obtained in Test 1 was located in the correction table, Table VIII (loc. cit., pp. 72-3), opposite chronological age for Test 1, and the adjustment at the head of the column noted. Then the same obtained I.Q. at Test 1 was again located in the body of the table, this time opposite chronological age for Test 2 (i.e., one year later), and again the appropriate adjustment at the head of the column was noted. The difference between these two adjustments gives the predicted rise or fall in I.Q. over the period of the one year between the two tests. For example, for I.Q. 63 at age 10.0 years, the correction is + 5; for the same I.Q. at age 11.0 years it is + 7. Thus the predicted fall in I.Q. over the year is 2 points. We have, then, for each subject the observed change in I.Q. and the predicted change in I.Q. By comparison of the two the applicability of the adjustments can be tested at the various age and I.Q. levels of the 350 subjects.

The observed and predicted mean changes in I.Q. are shown in Table II.

Theoretical considerations as well as practical experience indicate that there must be a point in intelligence-level, though not necessarily a clearly defined one, below which adjust[ments ...]. [In the] present sample an I.Q. of about 40 hits off this point very fairly. In comparing the observed and predicted mean changes, therefore, I.Q.s under 40 have been omitted from the means shown in the right-hand [...] of differences is shown for all I.Q.s under 40.
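The look-up procedure for the predicted change, described earlier in this section, can be sketched in a few lines. This is an illustration only: the dictionary `CORRECTION` is a hypothetical stand-in for the Roberts-Mellone correction table, keyed here by whole years of age and obtained I.Q. It holds only the two entries of the worked example, read as + 5 and + 7.

```python
# Hypothetical fragment of a correction table:
# (chronological age in whole years, obtained I.Q.) -> adjustment.
# Only the two entries of the worked example in the text are included.
CORRECTION = {
    (10, 63): 5,
    (11, 63): 7,
}


def predicted_change(iq_at_test1, age_at_test1, table=CORRECTION):
    """Predicted change in obtained I.Q. over one year.

    The same obtained I.Q. is looked up at the age of Test 1 and again one
    year later; a larger correction a year later means a predicted fall.
    """
    adjustment_now = table[(age_at_test1, iq_at_test1)]
    adjustment_next_year = table[(age_at_test1 + 1, iq_at_test1)]
    return adjustment_now - adjustment_next_year


print(predicted_change(63, 10))  # -2, i.e. a predicted fall of 2 points
```

A real table would need entries for every age and I.Q. level in the sample; none of those values is reproduced here.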
From inspection of Table II it can be seen that the agreement between the observed and predicted changes in I.Q. is, on the whole, very good up [to a chronological age of 10.11 years at first] test. As the predicted change in a child of 10.11 [years refers to the year ending at] 11.11 years, this means that the adjustments are applica[ble up to abo]ut 12 years. For this span of chronological ages the Roberts-Mellone [adjustments predict a fall] in I.Q. over the lower ranges of I.Q. This is borne out by the data. During the thirteenth year [the adjustments presuppose] a period of stability, during which little change i[s predicted. The data,] however, show a continued fall in I.Q. During the fourteenth and fifteenth years, the Roberts-Mellone curve predicts a [rise in I.Q. over the lower ranges]. (As the variance decreases, [low] I.Q.s [rise].) Instead of [this, the data] show that the levelling off, [which] was pr[edicted for the thirteenth year,] is in fact taking place in the fourteenth and fifteenth years.




TABLE II. [Observed and predicted mean changes in I.Q. The body of this table is illegible in the source.]




TABLE III. SUMMARY OF OBSERVED AND PREDICTED MEAN CHANGES IN I.Q.

                                        C.A. at 1st test
I.Q. at           6.0-10.11                11.0-11.11               12.0-13.11
1st test      No.   Obs.    Pred.      No.   Obs.    Pred.      No.   Obs.    Pred.
40-49          16   -1.8    -2.6         3   -2.7    -0.3        17   -0.2    +3.2
50-59          35   -0.9    [...]       16   -1.9    -0.1        32   +1.0    +3.0
60-69          48   -2.7    [...]       12   -3.6    -0.3        45   -0.7    +2.2
> 69           37   -2.9    [...]       11   -3.6    -0.3        21   -1.8    +1.4
Total         136   -2.19   -2.08       42   -2.98   -0.21      115   -0.36   +2.44
Std. Error of
Obsd. Mean          ± 0.355                  ± 0.575                  ± 0.367


Table III, which is a summary of the results shown in Table II, clearly shows these trends.
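The "Total" row of observed means in Table III is a weighted mean of the four I.Q. bands in each age group. The sketch below recomputes it from the band means as far as they can be read from the table. Because the band means are rounded to one decimal (and the scan is imperfect), the reproduction can only be approximate; the names used are illustrative.

```python
# (number of children, observed mean change) per I.Q. band, as read from Table III.
observed = {
    "6.0-10.11":  [(16, -1.8), (35, -0.9), (48, -2.7), (37, -2.9)],
    "11.0-11.11": [(3, -2.7), (16, -1.9), (12, -3.6), (11, -3.6)],
    "12.0-13.11": [(17, -0.2), (32, 1.0), (45, -0.7), (21, -1.8)],
}

# Totals as printed in the table: (number, observed mean change).
reported = {
    "6.0-10.11": (136, -2.19),
    "11.0-11.11": (42, -2.98),
    "12.0-13.11": (115, -0.36),
}


def weighted_mean(pairs):
    """Return (total number, mean weighted by number) for (n, mean) pairs."""
    total = sum(n for n, _ in pairs)
    return total, sum(n * m for n, m in pairs) / total


recomputed = {group: weighted_mean(pairs) for group, pairs in observed.items()}
for group, (total, mean) in recomputed.items():
    print(group, total, round(mean, 2), "reported:", reported[group])
```

The group sizes are reproduced exactly; the recomputed means agree with the printed totals to within about 0.1, which is as much as rounded band means can be expected to give.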


In their paper, Roberts and Mellone also describe corresponding calculations carried out on test results from a sample of 689 children between the ages of 5.0 and 14.11 years, being the sibs and cousins of a group of mentally defective children. This sample was obtained from areas around Bristol and Colchester. The correspondence with the Terman-Merrill sample was close except in the oldest chi[ldren; and an alte]rnative series of multipliers, based on [the English] sample, was included [in] their [Table] VII. It seemed worth while calculating the changes of I.Q. pre[dicted] on the basis of this latter sample. It might have been thought that the observed changes would correspond more closely with those predicted on the basis of a[n English sample]. This was not, however, found to be so, as is clearly revealed in Table IV. With the [...] exception of ages 9.0 to 9.11 years, there is closer agreement with the changes predicted on the basis [of the T]erman-Merrill sample.


TABLE IV. COMPARISON OF OBSERVED MEAN CHANGES WITH THOSE PREDICTED ON THE BASIS OF TERMAN-MERRILL AND ENGLISH SAMPLES

                                         Mean change
                                                  Predicted
C.A. at 1st test    Number    Observed    Terman-Merrill    English
                                          Sample            Sample

> ier wm ce SE RR 2569.25 — 3-13 

6.0— 6.11 8 == 1:75 A 2:81 = 3:44 

7.0- 7.11 16 — 0:94 075766 — 3:00 

10.0-10.11 44 — 2:59 — 0:21 -- 0:26 

11.0-11.11 42 — 2:98 4 1:09 + 2:36 

12.0-17 — 0:67 3:82 4- 8:04 

12.11 58 0-04 zi 

13.0-13.11 57 P ; 


[...] the effect of practice in the test on the second [I.Q.] obtained. [...] the interval between Test 1 and Test 2 was one year [and] the group consisted of [...] practice effect was [...]. [...] it has [...] expected it to manifest itself in the hig[her I.Q.] ranges. The results, however, show no evidence for this.


س 


V. SUMMARY 


1. In a recent paper Roberts and Mellone (1952) examined the relation of 
variance of Terman-Merrill I.Q. to C.A. and devised adjustments intended to correct 
I.Q.s for their lack of independence of chronological age. 

2. The retesting of 350 mentally defective and backward children after the lapse 
of a year enables a test to be made of the validity of the adjustments. The children 
ranged in age at first test from 6.0 to 13.11 years, with a mean I.Q. of 57. 


3. In children of all ages the results show that the adjustments are inapplicable to I.Q.s below a level of about 40.


4. With I.Q.s of 40 and over the changes predicted by the adjustments are closely 
realized up to the age of about 12 years. 


5. During the thirteenth year the adjustments presuppose a levelling off, whereas in the sample the mean I.Q. continues to fall. During the fourteenth and fifteenth years the adjustments presuppose a rise in I.Q., whereas in the sample there is a levelling off. Thus in this sample, apparently, the curve of variance of I.Q. on C.A. is inflected about a year later than was predicted.¹


nham, Leytonstone, and Walthamstow, and x 
entres of the County of Ess¢ ч 
. M. Bates, Dr. J. A. Fraser Roberts, ап 




The British Journal of Statistical Psychology, Vol. VI, Part II, November, 1953


THE SUBMISSIVENESS OF NATIONS 


By LEWIS F. RICHARDSON 


Hillside House, Kilmun, Dunoon, Argyll, Britain 


I. Introduction. II. The Linear Theory of Arms-races. III. First Hypothesis
about Submissiveness. IV. Second Hypothesis about Submissiveness. V. Third
Hypothesis (for Sides Equal in their Constants). VI. Conclusion.


I. INTRODUCTION


The problem to be discussed in this paper can be stated quite simply as follows. 
It is well known that a thorough military defeat leaves the vanquished nation for 
some years in a mood of submission to its victors (examples are the submissive 
behaviour of Vichy-France to the Germans during 1941 to 1943, or of West Germany 
to the Western Powers and of East Germany to the Soviet Union during 1945 to 
1952). As this effect seems universal, it is not necessary to name the nations ; they 
may be distinguished as A and B. Now suppose that A and B are both rested from 
war and are roughly equal in size and in industrial power ; and suppose further that 
A makes a show of force against B. Does such behaviour induce in B a submissive 


mood or a retaliatory mood ? 
Whatever may be thought of the following treatment, there can be no doubt that some of the questions connected with submissiveness are important. For example, in September, 1952, the North Atlantic Treaty Organization held combined naval manœuvres (exercise "Mainbrace") off Norway and in the Baltic simultaneously with land manœuvres (exercise "Holdfast") involving nearly 200,000 troops in Western Europe. The declared intention of N.A.T.O. was to deter any tendency in the Soviet Union towards aggression. On the contrary the Russian newspaper Pravda said that exercise "Mainbrace" was itself aggressive (B.B.C., September 13th). Personally I accept N.A.T.O.'s description of its intention; but I appreciate that Pravda may find [...] difficult to believe, because of the universal tendency to suspect foreign forces unless those of a[llies], a tendency represented in the following formulæ by the defence coefficients. [To be deterred] from any tendency towards aggression is here regarded as a mild form of submis[sion].

Professor C. A. Mace asked whether my paper contained any sugg[estions for experiments on small] groups of people, a type of enquiry now being pursued by many psycholog[ists]. Probably [a tendency] for submission has been occasionally observed wherever two or more individuals meet. For examples: even between wolves (Lorenz (14), 1952, ch. 12); or [...] yard (references are given by Landau (11), 1951); or tests on individual p[ersons ...] to the community (Allport (1), 1945, pp. 410-14). W. McDougall ([...], pp. 62-6) took self-subjection to be one of the primary human instincts. It might [...] to expect that by research on a group of, say, 50 people one could predict [... a] nation: for, at least in physics, large aggregates of molecules show some consp[icuous ...] not noticeabl[e ...]. For example, I doubt whether anybody, by intensive observation [of a] drop of [...], [could] have discovered the peculiar law of eddy-diffusion which operates in the sea (Richardson and Stommel (27), 1948).


II. THE LINEAR THEORY OF ARMS-RACES

[This] theory must first be mention[ed] because submissiveness was formulated by a [...]. [Bertrand] Russell (28) in 1915 [...] took the [view] that what is called defence does not in fact defend, be[cause at the same] time it provokes retaliation. Gregory Bateson ((2), 1936, p. 266) [...] New Guinea and generalized the concept as '[symmetrical] schismogenesis,' of which an arms-race is another example. Russell's concept of mutual stimulation was formulated by me as a pair of simultaneous differential equations ([...], revised 1935). In 1938 I hit on a way of testing the formulæ quantitatively in [...] expenditure. The testing was continued in a monograph published a few months be[fore] the war (Generalized Foreign Politics, Brit. J. Psychol. Mon. Sup., 1939; a better title would have been The Theory of Arms-races).


[... ea]rnings of an average sort of citizen, usually those of a [...]. The latest summary, including a glance at the present arms-race, is being [...]. The sceptical reader is referred to the aforesaid publications for the facts; here there is room only for the equations which summarize them. As we pass from the study o[f individuals to th]at of nations, the vagaries of individuals merge into the quasi-mechanical regularities of mass-behaviour.

Let $x$ and $y$ denote the warfinpersals of the opposing sides, $t$ the time, and $g, h, \alpha, \beta, k, l$ be constants of which $\alpha, \beta, k, l$ are positive. The hypothesis is that

$$\frac{dx}{dt} = g - \alpha x + k y, \tag{2/1}$$
$$\frac{dy}{dt} = h - \beta y + l x. \tag{2/2}$$

In particular, if $g = 0$, $h = 0$, $k = l$, and $\alpha = \beta$, then the solution is

$$x + y = A e^{(k - \alpha)t}, \qquad x - y = B e^{-(k + \alpha)t}, \tag{2/3, 2/4}$$

where $A$ and $B$ are arbitrary constants. Functions like $x + y$ and $x - y$, which are proportional to a single exponential of the time, are called 'normal co-ordinates.' They lead on well to modifications intended to represent submissiveness: that is why, in the subsequent diagrams, the axes are turned at 45° to those of $x$ and $y$.
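The normal co-ordinates can be checked numerically. The following is a minimal sketch, not part of the original paper: it integrates the symmetric form of (2/1) and (2/2) with a hand-written fourth-order Runge-Kutta step and compares $x + y$ and $x - y$ with single exponentials $e^{(k-\alpha)t}$ and $e^{-(k+\alpha)t}$. The constants and starting values are arbitrary assumptions chosen for illustration.

```python
import math

ALPHA, K = 0.5, 1.0          # illustrative constants, with g = h = 0, k = l, alpha = beta


def rates(state):
    """Right-hand sides of (2/1), (2/2) in the symmetric special case."""
    x, y = state
    return [-ALPHA * x + K * y, -ALPHA * y + K * x]


def rk4(f, state, t_end, steps):
    """Classical fourth-order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = f(state)
        k2 = f([s + 0.5 * h * d for s, d in zip(state, k1)])
        k3 = f([s + 0.5 * h * d for s, d in zip(state, k2)])
        k4 = f([s + h * d for s, d in zip(state, k3)])
        state = [s + h * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
    return state


x0, y0, t_end = 1.0, 0.4, 2.0
x, y = rk4(rates, [x0, y0], t_end, 2000)

sum_exact = (x0 + y0) * math.exp((K - ALPHA) * t_end)      # x + y grows if k > alpha
diff_exact = (x0 - y0) * math.exp(-(K + ALPHA) * t_end)    # x - y always decays

print(x + y, sum_exact)
print(x - y, diff_exact)
```

With $k > \alpha$, as assumed here, the sum grows while the difference dies away, which is the arms-race between equal sides described in the text.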
When we are t rical facts, we are not given that а = В and that k = / 
exactly, although we ave reasons based on population, on industrialization, and on an assumption 
about common human nature, for believing that о/ 


cs by replacing the unknown hx + my by 
c immunity from error in a replacement of [ax — Moy 


У X X3 for they may even have ulty crops up in Sect, V below, where 


All the varieties of motion th: Қ Е 5), but 
without any restriction on the si quations like (2/1) and (2/2) 


, 1, have long been known (Lagrange, Weierstrass (31), 
the context 


: riab ial equation, an integral equation, O! 
a non-linear description, That was m objection t. Қа í : 
Proposed by Mr. M. R. Но) een eee 


III. FIRST HYPOTHESIS ABOUT SUBMISSIVENESS
The statesmen of U.S.A., Britai 


ppear to pay no attention 

type of weapons, si 

t likely to provoke cou 

е relyin, as SS 

e no force, much submi son o some unexpressed p Бош n RE 
sene, inking proportionately.” "The perio! т оше SU simple. 

I remember an impertinent schoolboy who [was made m]ore rebellious by various mild reproofs and punish[ments in the] presence of a class o[f b]oys, but who was finally [...] much more severe punishment. The behaviour of nations seems to involve a similar reversal.


In 1939 Richardson (21) modified (2/1) and (2/2) by non-linear terms intended to represent [submissiveness]:

$$\frac{dx}{dt} = g - \alpha x + k y \{1 - \sigma (y - x)\}, \tag{3/1}$$
$$\frac{dy}{dt} = h - \beta y + l x \{1 - \rho (x - y)\}, \tag{3/2}$$

where $\sigma$ and $\rho$ were positive constants representing the tendency to submit.


Application of the First Hypothesis (a) To Very Unequal Nations. Eqn. (3/1) and (3/2) were originally intended to represent a large $x$-nation overawing a much smaller $y$-nation. If we leave out $g$ and $h$, as not of the essence of the matter, and if we neglect $y$ in $x - y$, these equations become

$$\frac{dx}{dt} = -\alpha x + k y (1 + \sigma x), \tag{3/3}$$
$$\frac{dy}{dt} = -\beta y + l x (1 - \rho x). \tag{3/4}$$

It is $dy/dt$ that will become negative. The smaller nation is suppressed.


For example, in 1948 claims by Guatemala to British Honduras were suppressed by a show of armed force by Britain, without any fighting (K9153, K11511). I am not suggesting that the suppression of a smaller nation is necessarily either just or unjust; but am merely asserting that it is what usually happens in a world of power politics.


(b) To Nations Equal as to their Constants. In 1951 I noticed that the terms representing submissiveness would have remarkable consequences even if the two opponents were equal in their constants. How this might happen was explained in a letter to Nature (1951, ix, 29) headed "Could an arms-race end without fighting?", in which arguments about increase or decrease were applied only in circumstances not far removed from the actual. Wishing, however, to see a general integral formula over the whole field, and not finding any, I applied to Dr. M. L. Cartwright, F.R.S., who has made a special study of non-linear differential equations ((4), 1949, (6), 1952). As an apology for the lack of general integral formulæ in the first two theories, I may quote her remark ((5), 1952, p. 88) that "what the pure mathematicians have done shows that the non-linear phenomena are genuinely complicated, and no easily applicable general theory can be expected." As the subject matter is economic or political, 3 per cent. accuracy coupled with a plainly intelligible meaning is to be preferred to infinite accuracy coupled with a complicated formula.


A theory of the submissiveness of equal groups must not be too deterministic: it must not say which of them will submit, for that is obviously a chancy affair. Such "stochastic processes" have been the subject of a symposium at the Royal Statistical Society (Moyal, J. E., Bartlett, M. S., and Kendall, D. G. (16), 1949).


By the time an arms-race has gone so far as to be alarming, former grievances and ambitions seem relatively unimportant; so that we may neglect $g$ and $h$. On setting also $\alpha = \beta$, $k = l$, $\sigma = \rho$, eqn. (3/1) and (3/2) simplify to:

$$\frac{dx}{dt} = -\alpha x + k y + k\sigma (xy - y^2), \tag{3/5}$$
$$\frac{dy}{dt} = k x - \alpha y + k\sigma (xy - x^2). \tag{3/6}$$

From a study of the submission of Germany in the 1920s I estimated that

$$\sigma = 0.34 \times 10^{-7} \text{ per person} \tag{3/7}$$

(Richardson (23), 1949). As this information is meagre, let us change to new variables which involve $\sigma$ in such a way [that we] can discuss its general effects without knowing its numerical value. Thus define $U$, $V$ by

$$x = U/\sigma, \qquad y = V/\sigma, \tag{3/8, 3/9}$$

so that $U$ and $V$ are pure numbers. Eqn. (3/5) and (3/6) then become

$$\frac{dU}{dt} = -\alpha U + k V + k(UV - V^2), \tag{3/10}$$
$$\frac{dV}{dt} = k U - \alpha V + k(UV - U^2). \tag{3/11}$$

Further let

$$t = \tau / k, \tag{3/12}$$

so that $\tau$ is a pure number. For thus

$$\frac{dU}{d\tau} = -\frac{\alpha}{k} U + V + UV - V^2, \tag{3/13}$$
$$\frac{dV}{d\tau} = U - \frac{\alpha}{k} V + UV - U^2. \tag{3/14}$$

Only one constant remains explicit, namely, $\alpha/k$, and it is a pure number. Before 1877 arms-races did not occur, and statesmen believed in a stable balance [...] such as would occur if $\alpha/k > 1$. From a study of the arms-race which led to the First [World] War it was concluded that $k - \alpha =$ [...]. Owi[ng ...]

$$\alpha / k = \tfrac{1}{2}. \tag{3/15}$$


Arms-races between equal nations show most conspicuously in the sum of the warlike efforts of the two sides, whereas submission is essentially concerned with their difference. Let us therefore define $S$ and $D$ by

$$S = U + V, \qquad D = V - U. \tag{3/16, 3/17}$$

By addition and subtraction of eqn. (3/13) and (3/14) it follows that

$$\frac{dS}{d\tau} = S\left(1 - \frac{\alpha}{k}\right) - D^2, \qquad \frac{dD}{d\tau} = -D\left(1 + \frac{\alpha}{k}\right) + DS, \tag{3/18, 3/19}$$

[or, with $\alpha/k = \tfrac12$,]

$$\frac{dS}{d\tau} = \tfrac{1}{2} S - D^2, \qquad \frac{dD}{d\tau} = -\tfrac{3}{2} D + DS. \tag{3/20, 3/21}$$

We can still keep track of submissiveness by remembering that it comes in by way of the quadratic terms, and would cancel in the linear term.
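The step from the $U$, $V$ equations to the $S$, $D$ equations is a short piece of algebra, and it can be spot-checked numerically. The sketch below is an illustration under the reading adopted here, namely $S = U + V$, $D = V - U$ and the right-hand sides $S(1 - \alpha/k) - D^2$ and $-D(1 + \alpha/k) + DS$. The function names and the random test points are assumptions made for the example.

```python
import random


def rates_uv(U, V, a):
    """Right-hand sides of (3/13) and (3/14); a stands for alpha/k."""
    dU = -a * U + V + U * V - V * V
    dV = U - a * V + U * V - U * U
    return dU, dV


def rates_sd(S, D, a):
    """Right-hand sides of (3/18) and (3/19)."""
    dS = S * (1 - a) - D * D
    dD = -D * (1 + a) + D * S
    return dS, dD


random.seed(0)
max_gap = 0.0
for _ in range(1000):
    U, V, a = (random.uniform(-2, 2) for _ in range(3))
    dU, dV = rates_uv(U, V, a)
    dS, dD = rates_sd(U + V, V - U, a)
    # dS must equal dU + dV, and dD must equal dV - dU.
    max_gap = max(max_gap, abs((dU + dV) - dS), abs((dV - dU) - dD))

print(max_gap)  # rounding error only
```

Agreement at many random points does not prove the identity, but it guards against a sign slip in either pair of equations.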
These equations purport to represent a wii 
that initially the nations were considerably 


de range of possible behaviour. In particular, suppose 
D — 0 is a point of balance. Sufficiently cl 


disarmed, as in 1948, or more so. Тһе point S = 0, 
lose to it the quadratic terms are negligible, so that 
5 = const, ей", D= const, eê”, (3/22), (3/23) 
The exponential increase of S is the arms-race beginning inconspicuously, Meanwhile D tends to 
zero, so that the opposing sides 


y tend to become equal in their warlike efforts. But (3/21) may be 
written 
dlog |D] Ж B 2 
tig ess (3/24) 


As S grows, log| D] would at first become more negative ; but there would come a time T,, say, 
when S = 3/2, and immediately after that D would be unstable. That is how submissiveness would 
first begin to reveal its presence, 

Too much determinism can be avoided by supposing that D had become near to zero ; so that 
a chance disturbance could change its sign before or soon after it became unstable, After D? had 
grown sufficiently, +S — D? would become negative and 50 S would decrease. The decrease of 5 
would be recognized as the end of the arms-race. The above argument was in effect that of my 
letter to Nature (25) of September 29th, 1951: and I Still think it credible in the restricted part of the 
'S, D plane to which it refers. 

But now comes the difficulty. Dr. Cartwright, by sketching the integrals over a wider field, emphasized a theoretical point of balance from which the [trajectories] spiral outwards. No corresponding historical facts have been found. There is the further difficulty that if the representative particle always comes out, how did it ever get in? There is a point of balance if simultaneously

$$0 = \tfrac{1}{2} S - D^2, \qquad 0 = D\left(S - \tfrac{3}{2}\right). \tag{3/25, 3/26}$$

The solution at $S = 0$, $D = 0$ was well known. It is [an] unstable 'in-and-out' point. Other solutions are

$$S = \tfrac{3}{2}, \qquad D = \pm\tfrac{1}{2}\sqrt{3} = \pm 0.866. \tag{3/27, 3/28}$$

That is either $(U, V)$ or $(V, U) = 1.183,\ 0.317$.
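The points of balance can be verified by direct substitution. The following minimal sketch assumes the forms $dS/d\tau = \tfrac12 S - D^2$ and $dD/d\tau = D(S - \tfrac32)$ and the definitions $S = U + V$, $D = V - U$; the helper names are illustrative.

```python
import math


def dS(S, D):
    """Right-hand side of (3/20)."""
    return 0.5 * S - D * D


def dD(S, D):
    """Right-hand side of (3/21)."""
    return -1.5 * D + D * S


root3_over_2 = math.sqrt(3) / 2
balance_points = [(0.0, 0.0), (1.5, root3_over_2), (1.5, -root3_over_2)]

# Both rates should vanish (up to rounding) at every point of balance.
residuals = [(dS(S, D), dD(S, D)) for S, D in balance_points]

# Convert the unsymmetrical point back to U and V.
S, D = balance_points[1]
U, V = (S - D) / 2, (S + D) / 2

print(residuals)
print(round(D, 3), round(U, 3), round(V, 3))  # 0.866 0.317 1.183
```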


The general features of the theoretical field ar 
method used by B. van der Pol in solvi i 


ing h i i i led 
Which 4045 is constant, wan өшіп (18), 1926)..The lines, calle 


f a conti : i B re 
uncertainty. However, van der Pol ((18), 1926) used twin Ree this process involved то! 
Fig. 1. [The theoretical] field of two opposing sides that are equal in their [constants; S refers to the sum an]d D to the difference of their warfinpersals. The [thin] curves are the isoclines along each [of which] dD/dS [is constant]. The directions of motion are indicated by short segments crossing each isocline. The thick lines are trajectories. [Figure not reproduced.]

To investigate in more detail the neighbourhood of the point of balance at $S = \tfrac32$, $D = \tfrac12\sqrt{3}$, define $\varepsilon$ and $\zeta$ by

$$S = \tfrac{3}{2} + \varepsilon, \qquad D = \tfrac{1}{2}\sqrt{3} + \zeta, \tag{3/29, 3/30}$$

and substitute the expressions in the eqn. (3/20) and (3/21), obtaining

$$\frac{d\varepsilon}{d\tau} = \tfrac{1}{2}\varepsilon - \sqrt{3}\,\zeta - \zeta^2, \qquad \frac{d\zeta}{d\tau} = \tfrac{1}{2}\sqrt{3}\,\varepsilon + \varepsilon\zeta. \tag{3/31, 3/32}$$

The terms of zero degree have cancelled. On the circle $\varepsilon^2 + \zeta^2 = 1$ the terms of the second degree [are of] the same order of [magnitude] as those of the first degree; [but within a circle], say, [of radius 1/10] the [terms] of the second degree become almost [negligible]. The [circle] of radius 1/10, to [the interior] of which the following approximation is restricted, is quite a small part of Fig. 1. In the standard manner try the assumption that, where $A$, $B$ are constants,

$$\varepsilon = A e^{\lambda\tau}, \qquad \zeta = B e^{\lambda\tau}, \tag{3/33, 3/34}$$

then

$$\lambda A = \tfrac{1}{2} A - \sqrt{3}\, B, \tag{3/35}$$

and

$$\lambda B = \tfrac{1}{2}\sqrt{3}\, A. \tag{3/36}$$

These simultaneous equations for $A$ and $B$ are consistent if the determinant

$$\begin{vmatrix} \tfrac{1}{2} - \lambda & -\sqrt{3} \\ \tfrac{1}{2}\sqrt{3} & -\lambda \end{vmatrix} = 0, \tag{3/37}$$

that is, if

$$\lambda^2 - \tfrac{1}{2}\lambda + \tfrac{3}{2} = 0, \tag{3/38}$$

the roots of which are

$$\lambda = 0.25 \pm 1.199\, i. \tag{3/39}$$

Because these are complex with positive real part therefore the particle moves spirally outwards. The period of $\varepsilon$ or $\zeta$ is $2\pi/1.199$ on the scale of $\tau$. But $t = \tau/k$; and $k$ has been found from the data of 1949, 1950, 1951 very doubtfully to be [...] years$^{-1}$. Therefore the period is:

$$[\ldots] \text{ years}. \tag{3/40}$$
For a political double 'swing of the pendulum' that seems incredibly rapid. Maybe the warfinpersals for 1952 and 1953, when available, will sho[w ...]. [A method] for making an accurate drawing of the trajectorie[s ...] worked out in the author's Arms and Insecuri[ty ...]. [When the roots are] complex, the trajectory can be obtained from an auxiliary equiangular spiral by stretching or squeezing it at right angles to a characteristic direction E.

[... the auxiliary equian]gular spiral revolves counter-clockwise and that its length increases in the ratio 3.707 per turn, or in the ratio 1.388 per quarter turn. The characteristic direction is at E = 105° from the $\varepsilon$-axis. The trajectory is obtained by stretching the equiangular spiral in the ratio 1.5005.
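The roots of the characteristic equation and the growth of the spiral can be recomputed with a few lines of arithmetic. This is a sketch under the assumption that the characteristic equation is $\lambda^2 - \tfrac12\lambda + \tfrac32 = 0$, as follows from linearizing (3/20) and (3/21) about the unsymmetrical point of balance. The conversion of the period into years is not attempted, because it depends on the value of $k$.

```python
import cmath
import math

# Coefficients of lambda^2 + b*lambda + c = 0.
b, c = -0.5, 1.5

discriminant = cmath.sqrt(b * b - 4 * c)   # purely imaginary here
root_plus = (-b + discriminant) / 2
root_minus = (-b - discriminant) / 2

growth_rate = root_plus.real               # positive: outward spiral
angular_rate = abs(root_plus.imag)

period_tau = 2 * math.pi / angular_rate    # one turn, on the scale of tau
ratio_per_turn = math.exp(growth_rate * period_tau)
ratio_per_quarter_turn = math.exp(growth_rate * period_tau / 4)

print(root_plus, root_minus)
print(period_tau, ratio_per_turn, ratio_per_quarter_turn)
```

The real part 0.25 and imaginary part 1.199 agree with the period $2\pi/1.199$ quoted in the text, and the growth ratios come out close to the quoted 3.707 per turn and 1.388 per quarter turn.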


Dr. Chike Obi (17), of University College, Ibadan, Nigeria, who was formerly a pupil of Dr. Cartwright, has made an extensive study, from the point of view of pure mathematics, of the equation

$$\ddot{x} + (\alpha + \beta x)\,\dot{x} + x + \gamma x^2 = 0, \tag{3/41}$$

where $\alpha$, $\beta$, $\gamma$ are constants not too far removed from zero. It is possible by a chain of transformations to connect the present hypothesis exactly to Obi's form. For this purpose we first introduce a new variable $\omega = D^2$, then shift the origin to the vortex point, and finally eliminate that deviation which is not in the $S$ direction.

It is sometimes asserted that 'history repeats itself.' If so, we should see a closed orbit. Actually there are only spirals in Fig. 1.

The origin is what I have elsewhere (1949) named an 'in-and-out' point. Lefschetz ((12), 1948) names it a 'col.' Cartwright calls it a 'saddle point' or 'col.' To the right of it the drift is towards rearmament; to the left of it the drift is towards negative armament, which I formerly suggested ((21), 1939, p. 7) was another name for co-operation. The practical difficulty was: how could the nations ever get across to the co-operative side, seeing that the flow diagram does not permit a direct crossing? Fig. 1, however, shows that by going round over the [...] the vortex there is a regul[ar ... re]gion. These are strange ideas. They all purport to refer to times [when] there was no fighting.

There is a boundary-spira[l ...] 'in-and-out' point. Any slightly lesser spiral passes first through the [...]. Any slightly greater spiral [...] beyond the 'in-and-out' [...] co-operative region.

[...] It was only by deducing [...] incredible entailments were noticed. Let us therefore pass on to a different hypothesis.


IV. SECOND HYPOTHESIS ABOUT SUBMISSIVENESS


[...] overwhelmingness of the threat depends not on y absolutely, but on y relative to x." The reconsideration is that it is not relative to the x present at that time, but to the x that could be mobilized. This 'x that could be mobilized' is almost a constant for any one nation, but varies very much from one nation to another.
Thus let the new hypothesis be

$$\frac{dx}{dt} = g - \alpha x + k y \{1 - y/m\}, \tag{4/1}$$
$$\frac{dy}{dt} = h - \beta y + l x \{1 - x/n\}, \tag{4/2}$$

where $m$ and $n$ are constants for those nations, and are such that $m$ is proportional to the total warfinpersal that the $x$-nation could mobilize, and $n$ is proportional to the corresponding quantity for the $y$-nation. Therefore

$$x \le m, \qquad y \le n. \tag{4/3, 4/4}$$

As before, $g$ and $h$ will be neglected, because they are not essential to the phenomena of submissiveness.

Consequences of the Second Hypothesis (a) For Very Unequal Nations. Suppose that $m \gg n$, then $n/m$, and a fortiori $y/m$, is near zero: so that $1 - y/m$ is almost unity. There is, on the contrary, no reason why $x/n$ should not much exceed unity. In these circumstances we have approximately from (4/1) and (4/2)

$$\frac{dx}{dt} = -\alpha x + k y, \tag{4/5}$$
$$\frac{dy}{dt} = -\beta y - l x^2 / n. \tag{4/6}$$

Therefore $y$ is decreasing. The smaller nation is suppressed. For this case of very unequal nations there is nothing much to choose between the two hypotheses. Either of them describes what usually happens in a world of power politics.
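(b) For Nations Equal in their Constants. From (4/1) and (4/2) we have

$$\frac{dx}{dt} = -\alpha x + k y (1 - y/m), \tag{4/7}$$
$$\frac{dy}{dt} = -\alpha y + k x (1 - x/m). \tag{4/8}$$

The constant $m$ can be made to disappear by a change of variables. Thus define $u$, $v$ by

$$x = um, \qquad y = vm; \tag{4/9, 4/10}$$

so that

$$u \le 1, \qquad v \le 1. \tag{4/11, 4/12}$$

For then

$$\frac{du}{dt} = -\alpha u + k v - k v^2, \tag{4/13}$$
$$\frac{dv}{dt} = -\alpha v + k u - k u^2. \tag{4/14}$$

The two remaining constants, $\alpha$ and $k$, can be reduced to one, in the following way: As in the former theory, define $\tau$ by

$$t = \tau / k, \tag{4/15}$$

so that

$$\frac{du}{d\tau} = -\frac{\alpha}{k} u + v - v^2, \tag{4/16}$$
$$\frac{dv}{d\tau} = -\frac{\alpha}{k} v + u - u^2. \tag{4/17}$$

Finally, define $S$ and $D$ by

$$S = u + v, \qquad D = v - u, \tag{4/18, 4/19}$$

obtaining

$$\frac{dS}{d\tau} = \left(1 - \frac{\alpha}{k}\right) S - \tfrac{1}{2}\left(S^2 + D^2\right), \tag{4/20}$$
$$\frac{dD}{d\tau} = -\left(1 + \frac{\alpha}{k}\right) D + SD. \tag{4/21}$$

Or, with $\alpha/k = \tfrac12$ as in the former theory, to represent modern conditions

$$\frac{dS}{d\tau} = \tfrac{1}{2} S - \tfrac{1}{2}\left(S^2 + D^2\right), \tag{4/22}$$
$$\frac{dD}{d\tau} = -\tfrac{3}{2} D + SD. \tag{4/23}$$

The equation for $dD/d\tau$ is the same as on the first hypothesis; but that for $dS/d\tau$ has $\tfrac12(S^2 + D^2)$ in place of simply $D^2$. [The points at which] $dS/d\tau$ and $dD/d\tau$ vanish simultaneously are given by

$$0 = S - S^2 - D^2, \qquad 0 = D\left(S - \tfrac{3}{2}\right). \tag{4/24, 4/25}$$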


If from (4/25) we accept $D = 0$ and substitute in (4/24) we get $0 = S - S^2$ which has solutions $S = 0$ or $S = 1$. Alternatively, if $S = 3/2$, the second condition is satisfied and the first becomes $D^2 = -3/4$, so that $D$ is imaginary. There are therefore only two points of balance, namely,

$$S = 0,\ D = 0 \quad \text{and} \quad S = 1,\ D = 0. \tag{4/26, 4/27}$$

We seem well rid of the puzzlingly unsymmetrical point of balance of the former theory. The point $S = 0$, $D = 0$ is of course of the same character as before, namely, an in-and-out point.
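The count of points of balance on the second hypothesis can be confirmed by substitution. The sketch below assumes the forms $dS/d\tau = \tfrac12(S - S^2 - D^2)$ and $dD/d\tau = D(S - \tfrac32)$; the function names are illustrative.

```python
def dS(S, D):
    """Right-hand side of (4/22)."""
    return 0.5 * (S - S * S - D * D)


def dD(S, D):
    """Right-hand side of (4/23)."""
    return D * (S - 1.5)


balance_points = [(0.0, 0.0), (1.0, 0.0)]
residuals = [(dS(S, D), dD(S, D)) for S, D in balance_points]

# On the other branch of (4/25), S = 3/2, condition (4/24) would require
# D^2 = S - S^2, which is negative, so no real D exists.
d_squared_on_other_branch = 1.5 - 1.5 ** 2

print(residuals)
print(d_squared_on_other_branch)  # -0.75
```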


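Fig. 2. [Isoclines and trajectories in the S, D plane on the second hypothesis. Figure not reproduced; the caption is largely illegible in the source.]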
The general field is shown in Fig. 2 by way of isoclines and trajectories. The isocline

$$\frac{dD}{dS} = j, \tag{4/28}$$

where $j$ is any constant, works out to

$$\frac{S - S^2 - D^2}{D} = \frac{1}{j}(2S - 3); \tag{4/29}$$

so that changing the sign of [...]. [...] while $S$ increases, we may neglect $D$ in [(4/22), obtaining]

$$\frac{dS}{d\tau} = \tfrac{1}{2}\left(S - S^2\right), \tag{4/30}$$

in which the variables are separable. Its integral can be verified to be

$$S = \frac{1}{1 + C e^{-\tau/2}}, \tag{4/31}$$

where $C$ is an arbitrary constant. [Thus] as $\tau \to \infty$, $S \to 1$.

Let us investigate the neighbourhood of the point $S = 1$, $D = 0$ on the second hypothesis. Define $\varepsilon$ here differently [from before], thus

$$S = 1 + \varepsilon. \tag{4/32}$$

Insert this in (4/22), (4/23). Then

$$\frac{d\varepsilon}{d\tau} = -\tfrac{1}{2}\varepsilon - \tfrac{1}{2}\left(\varepsilon^2 + D^2\right), \tag{4/33}$$
$$\frac{dD}{d\tau} = -\tfrac{1}{2} D + \varepsilon D. \tag{4/34}$$

If we neglect the quadratic terms in $\varepsilon$ and $D$, the solutions are

$$\varepsilon = A e^{-\tau/2}, \qquad D = B e^{-\tau/2}, \tag{4/35, 4/36}$$

where $A$, $B$ are arbitrary constants. This point is therefore one of thoroughly stable balance.
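Two claims in this passage lend themselves to a quick numerical check: that $S = 1/(1 + Ce^{-\tau/2})$ satisfies $dS/d\tau = \tfrac12(S - S^2)$, and that the linearized motion about $S = 1$, $D = 0$ decays. The following is a sketch under those readings of (4/30) to (4/36). The finite-difference step, the sample values of $\tau$ and $C$, and the function names are arbitrary choices for illustration.

```python
import math


def S_exact(tau, C):
    """Proposed integral (4/31) of the equation with D neglected."""
    return 1.0 / (1.0 + C * math.exp(-tau / 2.0))


def residual(tau, C, h=1e-5):
    """dS/dtau by central difference, minus (1/2)(S - S^2)."""
    slope = (S_exact(tau + h, C) - S_exact(tau - h, C)) / (2 * h)
    S = S_exact(tau, C)
    return slope - 0.5 * (S - S * S)


worst = max(abs(residual(tau, C))
            for tau in (0.0, 1.0, 5.0)
            for C in (0.5, 3.0, 20.0))


def jacobian(S, D):
    """Partial derivatives of (4/22), (4/23) with respect to S and D."""
    return [[0.5 - S, -D],
            [D, S - 1.5]]


# At S = 1, D = 0 the matrix is diagonal, so its diagonal entries are the
# exponents of the linearized solutions (4/35), (4/36).
J = jacobian(1.0, 0.0)

print(worst)
print(J)
```

Both diagonal entries equal $-\tfrac12$, matching the common factor $e^{-\tau/2}$; the same function evaluated at the origin gives $+\tfrac12$ and $-\tfrac32$, the in-and-out character noted earlier.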


The theoretical state of affairs at which we have arrived as a deduction from the second hypothesis might please some people: large armed forces, stable balance, and therefore no fighting. One of the editors of the Cambridge Modern History (3), Stanley Leathes, wrote in his preface to Vol. XII: "On the whole, the existence of this tremendous military equipment makes for peace. The consequences of war would be felt in every household; and statesmen, as well as nations, shrink from the thought of a conflict so immense." This was published in 1910. Unfortunately for Leathes' theory, the First World War broke out [four] years later. Yet this theory persists. Winston Churchill, speaking at the Pilgrims' Dinner on October 14th, 1952, said that "a third world war is unlikely because both sides know that it would begin with horrors never dreamt of before."

So my second hypothesis about submissiveness appears to agree with Leathes and with Churchill, but to have the outbreak of the First World War as evidence against it. Personally I have no confidence in it, regarding it as wishful thinking.


V. THIRD HYPOTHESIS (FOR SIDES EQUAL IN THEIR CONSTANTS)

When formulating the two former hypotheses, I had hoped to have some insight into what the motives might be. Now I ignore motives, and try to make a mathematical description of the courses in time of the opposing warfinpersals before, during, and after the First World War. The facts must first be reviewed.

(i) From A.D. 1907 to 1914, the two sides being the Triple Entente and the Triple Alliance. The [representative] point was moving nearly in a straight line with a spe[ed] proportional to its distance from [a] "point of balance" at x = £135 million, y = [...] million. [...] is given by Richardson [...]. [...] has been converted to warfinpersal by [...]; [the convers]ion is very rough for the other nations. Before these numbers were plotted in Fig. 3, [...]
(ii) There followed the First World War. Italy changed sides. [I have not] succeeded in directly ascertaining the warfinpersals of the o[pposing sides ...] [m]illion equivalent persons in 1917 [...]. Lloyd George ((13), War Memoirs, 1938, p. [...]) [gives the] composition of the British Army in [... inc]luding Colonial an[d ...] troops). From this statement I compute that there were [...] combatants. Now 12.2/2.96 = 4.13. It is understandable that the warfinpersal might considerably exceed the combatant strength, [as] the warfinpersal accounts for the non-combatants and the munition ma[kers]. Lloyd George (p. 1,577) further states that at the end of 1917 the total combatant str[ength] for the [...], [not] including any Russians or Roumanians, was 5.4 million against 5.2 million for [... the] Central Powers. [A ver]y rough estimate is made by assuming that the British factor 4.13 app[lies also to] the other nations so that the maximum warfinpersals were: 22[...] [and ...]8 × 10⁶ equivalent persons.

[... relevan]t to enquire when [sub]missiveness appeared in t[he ...]. The question is not about individuals or [...]


In another place ((22), 1948) I have given a theory of war-moods in which the people in [...]
by five variables; but here only one variable is [...]. On
December 12th, 1916, the German Chancellor suggested a peace conference. The opposing [...],
after consultation, said they "refuse to consider a proposal which is empty and insincere" (Lloyd
George, p. 661). There were various other tentative and secret peace feelers in 1917 and [...],
mostly Austrian. According to a Reichstag Committee, quoted by Lloyd George (p. [...]), "Up
to 15th July, 1918, the Supreme Army Command rejected the view that victory was no longer
possible of attainment by force of arms, and gave no support to peace negotiations upon the basis
of a military stalemate ..." During the next three weeks the fresh Americans came to the aid of the
tired French and British. A battle on August 8th, 1918, in which the Germans had no reserves [...],
changed Ludendorff's opinion, and on August 14th the Kaiser, presiding over a Crown Council at
Spa, admitted that Germany would have to find a suitable moment in which to come to an
understanding with the enemy.


(iii) After the First World War, the sides were not quite the same. There was the Russian
revolution and its consequences. U.S.A. withdrew into isolation. A new diagram was drawn
showing the warfinpersal of Germany plotted against that of its entourage, consisting of France,
Britain, Poland, Czechoslovakia, and Russia. This appears in a microfilm (Richardson (23), 194...)
and is modified here into part of Fig. 3. It shows motion along a nearly straight track, slowing
down as a point of balance is approached. There was a delay of eleven years in the neighbourhood
of this point, and then a departure, with increasing speed, along another almost straight track.
This was the second great arms-race beginning.

(iv) It led to the Second World War during which the warfinpersals approached the state of
total mobilization, but were difficult to ascertain. The sides were rather different from those of
the first war, Japan and Italy in particular having changed over previously, and Italy changing
back again. In October, 1939, Hitler made a peace offer: but Britain and France did not regard
it as at all submissive. After that the demand for the unconditional surrender of Germany delayed
any sign of submissiveness until defeat.

(v) The Occupation Statute for Germany, an expression of submission, has now been in operation
for seven years. Disarmament after the Second World War was barely over when enmity between
U.S.A. and the Soviet Union began to dominate the scene.


Fitting Formulæ to the Facts. It seems permissible to ignore the changes in the membership
of the sides, so that x and y may be regarded as having the same meaning throughout.
This has already been assumed in Fig. 3.


[Fig. 3. Pro- and anti-German warfinpersals in Europe, 1907 to 1932, simplified by ignoring other
changes in the sides, and in particular by ignoring the brief [...] intervention [...]. The
co-ordinates are the sum and difference. The numbers set beside the plotted points are dates.
Horizontal axis: x + y.]

[Fig. 4. Theory, from equations (6/17), (6/19), and (6/20). Numbers set beside the plotted points
are times [...] after the First World War. A unit of one year makes Fig. 4 agree with Fig. 3.
Horizontal axis: x + y.]




L. Е. RICHARDSON 


As the mathematical description is to be of that which usually happened, we have
to beg the question "Could an arms-race end without fighting?" by assuming, on the
evidence of two out of three trials, that it usually does not. This assumption is in
strong contrast with the first and second hypotheses, which leave that important question
open at the outset.

Remembering that an arms-race has to do mainly with the sum x + y and submissiveness
with the difference y − x of the warfinpersals, let $\xi$ and $\eta$ be defined by

$$\xi = x + y, \qquad \eta = y - x, \qquad (6/1),\ (6/2)$$

so that conversely

$$x = \tfrac{1}{2}(\xi - \eta), \qquad y = \tfrac{1}{2}(\xi + \eta). \qquad (6/3),\ (6/4)$$


Accordingly $\xi$ and $\eta$ are respectively constant multiples of the S and D of former theories. We can
now describe the changes in $\xi$ without mentioning $\eta$, thus:

(i) After wars there have been disarmaments. Therefore $d\xi/dt$ must be capable of either sign.

(ii) For small $\xi$ it is observed that $d\xi/dt$ varied as $\pm(\xi - \xi_1)$, where $\xi_1$ is a constant, called a
point of balance.

(iii) There was an upper bound to $\xi$ when both the opposing sides were fully mobilized for
war. Let $\gamma$ denote this upper bound.
The above three requirements are satisfied by the assumption that

$$\left(\frac{d\xi}{dt}\right)^{2} = a^{2}\,(\xi - \xi_1)^{2}\,(\gamma - \xi), \qquad (6/5)$$

where $a^2$ is some constant. A simpler expression could not be found; $\eta$ does not appear; $\xi$
cannot exceed $\gamma$, for if it did then $d\xi/dt$ would become imaginary; when $\xi$ is near zero then

$$\frac{d\xi}{dt} = \pm\, a\,(\xi - \xi_1)\sqrt{\gamma}.$$
The variables are separable, so that (6/5) can be integrated. The result, which can be verified
by differentiation, is

$$\xi - \xi_1 = (\gamma - \xi_1)\,\operatorname{sech}^{2}\beta t, \qquad (6/6)$$

where

$$\beta = \tfrac{1}{2}\,a\sqrt{\gamma - \xi_1} \geq 0. \qquad (6/7)$$

At $t = 0$, $\xi = \gamma$; that is to say, the constant of integration has been chosen so that $t = 0$ at the
maximum of $\xi$, which occurred during the war. As $t \to -\infty$, $\xi \to \xi_1$: there is not, on this theory,
any sudden or definite beginning of the increase of warfinpersal. This vaguely agrees with the
author's general impression of a collection of deadly quarrels that ended since A.D. 1820 (Richardson
(24), 1950). The question "Who began it?" seldom had any non-controversial answer. He
abandoned any attempt in that collection to name the aggressors.
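
The verification by differentiation mentioned above can also be done numerically. The following Python sketch reads (6/5) as (dξ/dt)² = a²(ξ − ξ₁)²(γ − ξ), (6/6) as ξ − ξ₁ = (γ − ξ₁) sech² βt, and (6/7) as β = ½a√(γ − ξ₁). The function names and the value ξ₁ = 5 are hypothetical, chosen only for illustration; γ = 44 and β = 0·5 per year are the values quoted elsewhere in the text. It is offered as a check under those assumptions, not as part of the original argument.

```python
import math

def xi(t, gamma, xi1, beta):
    """Solution (6/6): xi = xi1 + (gamma - xi1) * sech^2(beta * t)."""
    return xi1 + (gamma - xi1) / math.cosh(beta * t) ** 2

def residual(t, gamma, xi1, beta, h=1e-5):
    """Left side minus right side of (6/5), with a taken from (6/7)."""
    a = 2.0 * beta / math.sqrt(gamma - xi1)
    # Central-difference estimate of d(xi)/dt.
    rate = (xi(t + h, gamma, xi1, beta) - xi(t - h, gamma, xi1, beta)) / (2.0 * h)
    value = xi(t, gamma, xi1, beta)
    return rate ** 2 - a ** 2 * (value - xi1) ** 2 * (gamma - value)

GAMMA = 44.0  # upper bound, million equivalent persons (value quoted in the text)
XI1 = 5.0     # point of balance: hypothetical, for illustration only
BETA = 0.5    # per year, as in (6/9)

# Time at which xi is halfway between gamma and xi1: sech^2(beta * T) = 1/2.
T_HALF = math.acosh(math.sqrt(2.0)) / BETA

# Largest mismatch of (6/5) over a few sample times before and after the maximum.
SAMPLE_TIMES = (-6.0, -3.0, -1.0, -0.2, 0.5, 2.0, 5.0)
worst = max(abs(residual(t, GAMMA, XI1, BETA)) for t in SAMPLE_TIMES)

if __name__ == "__main__":
    print(f"xi(0) = {xi(0.0, GAMMA, XI1, BETA):.3f}")
    print(f"beta * T = {BETA * T_HALF:.4f}, T = {T_HALF:.3f} years")
    print(f"largest |residual of (6/5)| = {worst:.2e}")
```

The residual is zero only to within the finite-difference error, so it should be compared with a small tolerance rather than with exact zero.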

To determine $\beta$ from observation, let $T$ be the time at which $\xi$ was halfway between $\gamma$ and $\xi_1$.
It follows from (6/6) that $\operatorname{sech}^2 \beta T = \tfrac{1}{2}$, so that

$$\beta T = \operatorname{arcosh}\sqrt{2} \approx 0.88. \qquad (6/8)$$

The theoretical diagram, Fig. 4, has been drawn with

$$\beta = 0.5\ \text{year}^{-1}. \qquad (6/9)$$

Next as to $\eta$, the facts represented in Fig. 3 suggest that $\eta$ did not alter much while $\xi$ was
increasing, but that, after the maximum of $\xi$, changes in $\eta$ were conspicuous. These were of two
kinds: at first, while $\xi$ was large, $|\eta|$ increased; later, when $\xi$ was near $\xi_2$, there was a settling of
$\eta$ to about the same values, so that finally


(6/10) 


Nie 


6/11) 


This expresses the final submission of the x-side. It would have been a mistake to have
made $\eta$ tend to zero, unless $\xi$ also did. To express both kinds of change in $\eta$ we need something
like a disjunction or "either-or" formula. It can be constructed on the model of $\xi + 1/\xi$, which
behaves like $\xi$ when $\xi$ is large, but like $1/\xi$ when $\xi$ is near zero. When $\xi$ was large, $\eta$ departed further
from zero without changing sign, as if $d\eta/dt$ were proportional to $\eta$. When $\xi$ was small, $\eta$ tended
to $\xi_2$, as if $d\eta/dt$ were proportional to $\xi_2 - \eta$. These diverse tendencies could be combined by
weighting them respectively by $\xi$ and $1/\xi$. That, however, would make the two terms of different
dimensions, for $\xi$ is a number of equivalent persons. So also is $\gamma$. It is found to be satisfactory
to use the dimensionless weights $2\xi/\gamma$ and $\gamma/2\xi$. Thus one arrives at the assumption that

$$\frac{d\eta}{dt} = \sigma_1\left\{\eta\,\frac{2\xi}{\gamma} + (\xi_2 - \eta)\,\frac{\gamma}{2\xi}\right\}, \qquad (6/12)$$

in which $\sigma_1$ is some constant measure of submissiveness, and is the reciprocal of a time.


is required to be not quite so sudden, four approximations to $H(t)$ are available (Van der Pol and
Bremmer (19), 1950, pp. 56-7). I shall write

$$\frac{d\eta}{dt} = \sigma_1\,H(t)\left\{\eta\,\frac{2\xi}{\gamma} + (\xi_2 - \eta)\,\frac{\gamma}{2\xi}\right\}, \qquad (6/13)$$

so that $d\eta/dt = 0$ for $t < 0$.

The inevitable non-linearity has been introduced, but in a manner quite different from that of
hypotheses I and II.

As a discontinuity at $t = 0$ in $d\eta/dt$ has been admitted, a discontinuity also in $d^2\xi/dt^2$ may
not seem too fanciful. It allows us to join together at $t = 0$ two formulæ for $\xi$, both like (6/6)
but the later one having b for a and C for $\beta$, so that the final point of balance $\xi_2$ need not agree
with the initial $\xi_1$. Observed constants, in million equivalent persons, were about

$$\gamma = 44, \qquad \xi_2 = 13. \qquad (6/14),\ (6/15)$$


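The integration of (6/13) is, however, much simplified if we take $\xi_2 = 0$, for then $\eta$ is a factor of the
second member, so that for $t > 0$

$$\frac{d\log|\eta|}{dt} = \sigma_1\left\{\frac{2\xi}{\gamma} - \frac{\gamma}{2\xi}\right\}. \qquad (6/16)$$

This simplified problem is solved as a preliminary. Substitute, in view of (6/6) and (6/9),

$$\frac{\xi}{\gamma} = \operatorname{sech}^{2}\tfrac{1}{2}t, \qquad (6/17)$$

obtaining from (6/16)

$$\frac{d\log|\eta|}{dt} = \sigma_1\left\{2\operatorname{sech}^{2}\tfrac{1}{2}t - \tfrac{1}{2}\cosh^{2}\tfrac{1}{2}t\right\}. \qquad (6/18)$$

The integral is

$$\eta \propto \exp\left[\sigma_1\left\{4\tanh\tfrac{1}{2}t - \tfrac{1}{4}\sinh t - \tfrac{1}{4}t\right\}\right], \qquad (6/19)$$

which can be verified by differentiation. The measure $\sigma_1$ of submissiveness was, of course, unknown;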
but Fig. 4 was plotted with 


ы (6/20) 

and comparison of Fig. 4 with Fig. 3 shows that $\sigma_1 = 1$ is of the right order of magnitude. The
complete eqn. (6/13), with $\xi_2$ not zero, has for an integrating factor the expression (6/19) (Forsyth (7),
1929, p. 20) and is thus reducible to quadratures. But these look difficult and are perhaps not worth
pursuing at this stage.
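
The simplified integration can be checked numerically. The Python sketch below assumes that (6/18) reads d log|η|/dt = σ₁{2 sech²(t/2) − ½ cosh²(t/2)} and that (6/19) reads η ∝ exp[σ₁{4 tanh(t/2) − ¼ sinh t − ¼ t}]. The function names, the normalising constant `eta0`, and the choice σ₁ = 1 per year are assumptions made for illustration only.

```python
import math

def log_eta_rate(t, sigma1):
    """Right-hand side of (6/18): d(log|eta|)/dt for t > 0."""
    c = math.cosh(t / 2.0)
    return sigma1 * (2.0 / c ** 2 - 0.5 * c ** 2)

def log_eta(t, sigma1):
    """Exponent of (6/19), i.e. log(eta / eta0)."""
    return sigma1 * (4.0 * math.tanh(t / 2.0) - 0.25 * math.sinh(t) - 0.25 * t)

def eta(t, sigma1, eta0=1.0):
    """Equation (6/19), normalised so that eta(0) = eta0 (eta0 is hypothetical)."""
    return eta0 * math.exp(log_eta(t, sigma1))

def mismatch(t, sigma1, h=1e-6):
    """Finite-difference derivative of log(eta) minus the rate in (6/18)."""
    numeric = (log_eta(t + h, sigma1) - log_eta(t - h, sigma1)) / (2.0 * h)
    return numeric - log_eta_rate(t, sigma1)

SIGMA1 = 1.0  # per year; an assumed order of magnitude, not a fitted value

errors = [abs(mismatch(t, SIGMA1)) for t in (0.1, 0.5, 1.0, 2.0, 4.0)]

# |eta| stops growing where 2 sech^2(t/2) = (1/2) cosh^2(t/2), i.e. cosh(t/2) = sqrt(2).
T_PEAK = 2.0 * math.acosh(math.sqrt(2.0))

if __name__ == "__main__":
    print(f"largest mismatch between (6/18) and (6/19): {max(errors):.2e}")
    print(f"|eta| is greatest at t = {T_PEAK:.3f} years, eta/eta0 = {eta(T_PEAK, SIGMA1):.3f}")
```

On this reading |η| first grows, while ξ is still large, and then decays once ξ has fallen below γ/2, which is the "either-or" behaviour the weights 2ξ/γ and γ/2ξ were chosen to produce.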


VI. CONCLUSION


The memory of recent overwhelming military defeat, say, within five years, certainly
leaves a tendency to submit. A large state has been known to overawe a very much smaller
state, without any fighting. Apart from those two types, I have looked for submissive
effects, but have not noticed [...]


against A is proportional to the excess of B’s armed force over A’s at the same time, or 
(ii) that it is not A’s armed force at the same time that matters, but what A could mobilize. 
Not all the consequences of these hypotheses are immediately obvious. In historically 
known circumstances the hypotheses seemed plausible. It was only by deducing their 
consequences in unusual circumstances that incredible entailments were noticed. This is a 
good example of the usefulness of mathematics in social science.

Most governments seem to expect that a show of armed force, to a rested and equal 
group of nations, will have an effect of the same sign as an overwhelming military defeat. 
This expectation appears to be a mistake : the actual effects go opposite ways. 


The Quakers, Gandhi, and others have tried to persuade everybody to be as non-violent
as the exceptional Christian saints. Most people feel the suggestion to be too utterly
contrary to their defensive impulses. Could not a middle way be found, with the motto
"defence without provocation"? This motto might well be applied to words as well as
deeds. The existence of the various arms of nation A presumably has different psychological
effects in peacetime on nation B; for example, one may guess that B is much more
provoked to retaliate by the existence of A's long-range bombers than by A's land-based
fighters. Jonathan Griffin (8) discussed 'defence without menace' in 1936, and now again
the problem of the classification of armaments into provocative and non-provocative requires
much more attention than it receives.

Submission, except in the sense of defeat, is not a spectacular subject for newspaper
headlines; it is not like fighting for freedom. Yet we need to remember that in any
organization, whether it be a social club, a scientific society, a business, a national
parliament, or the United Nations, a suitable amount and distribution of submissiveness is
essential for smooth working. If each member struggled all the time for his own advantage there
would be chaos.

The present study has perhaps over-emphasized the bitter submission of the defeated, 
threatened, suppressed, or snubbed, and has neglected the joyous devotion to an admired 
leader. To correct the balance let me instance a classical story illustrating the latter tendency, 
that of Naomi and Ruth. It is described as an affair of tribal loyalties. 


In the United Nations, activities of the follow-my-leader type are similarly noticeable 
on both sides of the major schism. 


REFERENCES 


1. Allport, G. W. (1945). Personality. New York: Henry Holt.
2. Bateson, G. (1936). [...]
3. Cambridge Modern History, XII. Cambridge: University Press.
4. [...] of Science, 6.
5. Cartwright, M. L. (1952). Non-Linear Vibrations. Math. Gazette, 36, 81-8.
6. Cartwright, M. L. [...] oscillations in non-linear systems. (Lectures at [...])
7. Forsyth, A. R. (1929). A Treatise on Differential Equations. London: Macmillan.
8. Griffin, J. (1936). Alternative to Rearmament. London: Macmillan.
9. Horne, M. R. (1951). Letter in Nature, November 24th.
10. Keesing's [...] publications: cited as 'K.'
11. [...] Math. Biophysics, 13, 1-19.
12. [...] on Differential Equations [...] University Press.
13. Lloyd George, D. (1938). War Memoirs. London: [...]
14. Lorenz, K. Z. (1952). King Solomon's Ring. London: Methuen.
15. McDougall, W. (1917). [...]
16. Moyal, J. E., Bartlett, M. S., and Kendall, D. G. (1949). "Symposium on Stochastic Processes."
    J. Roy. Stat. Soc., B, 11, 150-282.
17. Obi, C. (1950). (Unpublished.) Thesis for Ph.D. [...]
18. Pol, B. van der (1926). Phil. Mag., 2, 978-92.
19. Pol, B. van der, and Bremmer, H. (1950). Operational Calculus based on the Two-sided Laplace
    Integral. Cambridge: University Press.
[...] The Indian Journal of Statistics, 12, 205-28.
27. Richardson, L. F., and Stommel, H. (1948). Journ. Meteor., 5, 238-40.
28. Russell, Bertrand (1914). War the Offspring of Fear. London: Union of Democratic Control.
29. Volterra, V. (1930). Theory of Functionals. London: Blackie.
30. Jeffreys, H. and B. S. (1946). Methods of Mathematical Physics. Cambridge: University Press.
31. Weierstrass, K. (1875). K. Akad. der Wiss. (October 28th). Also Werke, II, 75-6.



Vol. VI, Part II        The British Journal of Statistical Psychology        November, 1953


A PRELIMINARY STUDY OF THE 
THEMATIC APPERCEPTION TEST 


By AMYA SEN 
University College, London 


I. History of Method. II. Subjects. III. Test Material and Mode of Scoring.
IV. Reliability and Validity of Assessments. V. Summary and Conclusions.


I. HISTORY OF METHOD 


In what is known as the 'thematic apperception test' a series of somewhat
vague and problematic situations is presented to the subject, and he is asked to
make up a brief story explaining the situation, and indicating the feelings and thoughts
of the principal characters. In the more usual form the material consists of standard
pictures; in other forms the situations are described verbally. The psychologist
then attempts to infer the personal characteristics of the individual tested from the
nature of the 'themes' appearing in his stories.

The main object of the investigation described in the following pages was (i) to
examine the possibility of scoring the pictorial form of the test in accordance with a
more adequate theory of personality, and in particular (ii) to estimate the validity of
assessments so obtained for certain basic traits.

In order to decide which traits are likely to lend themselves to assessment by such
a test, it is essential to begin by asking what kind of mental activities such tests tend,
in general, to provoke. This preliminary study has been strangely neglected by
nearly all who have devised or used such procedures.¹
The [...] material for tests of character and personality has a far longer history than
is commonly [...]. As long ago as 1886, Dr. Sophie Bryant,² the first [...]
London High School and one of Francis Galton's most active disciples, [...]
have been the earliest set of experiments in [...]
words, and [...] chiefly in terms [...].
[...] Binet [...] of individual differences, got children to describe both pictures
and objects, and distinguished four main types: the observer, the describer, the erudite, and the
emotional ((3) and other papers); later he included pictures in his [...].
Most of the older manuals of experimental psychology include experiments [...]
diagrams, or ink blots, designed to illustrate the principles of apperception; [...]
the importance of personal and emotional factors,⁴ as well as tendencies [...]
described as 'primary and secondary laws of association.'


¹ For much of what follows I am indebted to [...] British work on personality [...]

² Curiously enough, Vernon, in his survey of [...]
the various papers published by Sophie Bryant. There [...]
account of 'tests of description and report' ((6), pp. 290, 292, 315, [...]).

³ E.g., Witmer, L., Analytical Psychology: A Practical [...]; [...]
C. E., Elementary Experiments in Psychology (1908), ch. xii.

⁴ "Apperception," says Stout (and this is the special interpretation developed by British in contrast
to continental writers), "is essentially a conative process, [...] the process by which a conation
fulfils itself" (Analytic Psychology, II, [...]): it is instructive to note how Stout's
[...] anticipates the Freudian doctrine
of [...] unconscious tendencies, and indicates their importance
for the organization of personality.

⁵ Cf. Stout, op. cit., pp. 132f., on the [...] apperceptive systems.




It was a growing realization of the defects of the older association theories that led British writers
to develop the idea of a more general integrative principle: in doing so, their criticisms of
associationism and their emphasis on organization rather than mere linkage largely anticipated the views
of the Gestalt psychologists, particularly in the field of personality. The introduction of the term and
idea of 'apperception' into modern psychology is due chiefly to the Herbartian school, particularly
in this country to Stout (2) and Adams (5). It may perhaps best be defined as "the process whereby
the content of a particular sensory presentation, at first relatively indeterminate and meaningless,
seems to become clearer and more determinate, and so acquires for the perceiver some kind of apparent
meaning, in virtue of a fusion or synthesis with contents already systematically organized in the
perceiver's mind." All Herbart's followers insist that a man's mode of apperceiving an object
depends largely on his own individual 'apperception-systems,' and hence may incidentally provide
a clue to his intelligence, interests, and character. Adams relates the legend of the Indian princess,
who embroidered on her banner representations of herself, a coin of the realm, and her father's
imperial crown, all superposed to make a single picture-puzzle; each of her suitors was required
to say what the device portrayed: the lover who saw, not the money or the crown, but the portrait
won the contest. Steinthal, one of Herbart's best-known exponents, quotes a similar story: "we
place a man," he concludes, "largely by the point of view from which he regards things, in short,
by his mode of apperceiving."*

Binet, Stern, and most of the earlier investigators who used pictorial tests were
interested chiefly in the examinee's ability to observe what is before him and to
organize what he observes into an intelligent report. The first to exploit the principle
of apperception for testing the dynamic aspects of personality appears to have been
Burt. In describing his tests of 'emotional activities' he writes: "Ambiguous
pictures, that is, pictures which can be apperceived or attended to in various
incompatible ways, are exhibited to the different individuals; . . . and the particular mode
of apperception shows consistent and significant differences." Some of the most
conspicuous peculiarities were those revealing the distinctive interests of the two
sexes (7). He observes that hitherto most writers had accepted Whipple's definition
of a "mental test" as denoting "the experimental determination of some phase of
mental capacity" ((6), p. 1), thus restricting it to cognitive processes; the attempt
to measure the affective and conative processes, or, as he sometimes terms them,
the 'dynamic or orectic aspects' of personality, has been "almost entirely neglected."
He reports the results of various experimental procedures, and concludes that "the
best tests of emotional disposition are to be obtained by the method of impression
rather than of expression," i.e., by their influence on such processes as attention,
association, preference, and apperception ((7), p. 381; cf. (9)).


Since the methods of scoring and interpretation adopted in the present investigation
were based on those worked out by Burt in his work with children, it may be helpful to [...]
quite briefly his general procedure and how the results were classified and marked.

In his re-standardization of the Binet intelligence-tests he notes how Binet's picture of a
mysterious figure standing on a couch and looking out of a window elicited striking individual
differences, quite apart from variations in observation or intelligence: "some children describe
[the] man [...] he appears in the [...]: others subjectively identify themselves
[...] what he might be supposed to see, basing their descriptions largely [on]
their own wishes, experiences, or fears." He accordingly retained [...]
pictures.* The others* used in these earlier researches were [...]
Dawn, Boecklin's Toteninsel, Millais's Boyhood of Raleigh, Yeames's When Did You Last See Your
Father?, Wallis's Death of Chatterton, Newcome's What Shall He Be When Grown Up?, [...]
[...] an illustration from the Strand Magazine entitled She Was Looking at Me [...],
[and] two drawings from Punch. [...]
series because they were readily accessible to other [...]. [In later experi]ments
he found that better results could be obtained by using specially [...]
by previous work and drawn by Miss Charleston ((16), p. 113). He describes the whole procedure
as "resembling what older writers would have regarded as a 'free association' test, with concrete
scenes for stimuli instead of words": "the visible form," he explains, "appeals more directly to
primitive emotions and less to the higher critical feelings . . . whereas words lead more rapidly to
emotional repression" ((10), pp. 326, 330).


The method of administration resembled that prescribed by Whipple for tests of description and
report ((6), pp. 287, 301). In the group-form of the tests each child had a copy of the picture in front
of him, and was asked first to write a free composition ('story') and then to answer an
'interrogatory.' Titles were erased, and had to be supplied by the child. With younger children a roneo'd
outline of a story was presented to be filled up like a completion test. In the individual form of the
tests the story and the replies were given orally. The schedule for marking the results, however, 
instead of referring solely to * accuracy and range of report ' (as in Whipple's version), included a set 
of headings for classifying and assessing replies, based on much the same scheme of personality as 
was used for other tests or assessments, noting (as had been done for the verbal association tests) 
the common types of clue to be found in the content of the responses and the apparent significance 
of each (21). The selection of clues was determined by a systematic item-analysis. 


The quantitative results were only moderately successful (average figures are, for reliability,
0·57, for validity, 0·16 to 0·38: (16), p. 116). But the method proved to be, like the association
test, a useful starting point which could be followed up in private interviews on quasi-psychoanalytic
lines. It was therefore, as he explains, regularly used by himself and his co-workers in studies of
delinquent and neurotic children.¹

With the revival of associationism, owing to the influence of the behaviourist school,² interest
in such concepts and procedures seems to have suffered an eclipse. There is no mention of
apperception tests among the 'projection tests' described in the first edition of Cattell's Guide to Mental
Testing, nor yet in Earl's excellent review of 'methods of assessing temperament and personality.'³
However, since the war the work of Murray and his colleagues at Harvard has led to widespread
revival of interest in this type of test. To begin with, they apparently envisaged the method as a
means merely for 'investigating fantasies' (11), but later realized it could be used for more general
purposes (12).

The type of material and method of interpretation developed by these writers is excellently 
described by Vernon (22), who also gives a critical and impartial summary of the chief results. 
“Binet, Burt, and others," he writes, “ have used picture interpretation as a means of studying 
intellectual and emotional qualities, but the test in its present form was first described by Morgan 
and Murray : they collected a standard series of 30 photographs ; the subject is shown each picture, 
and asked to make up a story describing the situation, the events leading up to it, and the outcome, 
together with the thoughts and feelings of the characters. .. . Unfortunately the interpretation varies 
widely with the theoretical background of the psychologist. Those who follow Murray have adopted 
his elaborate system of needs and presses (external forces that the individual regards as beneficial or 
harmful) : Rapaport and others present simpler schemes. They are based on the style or structure 
—compliance with instructions, language characteristics, logical coherence, consistency of stories 
with one another, realisticness, etc., and on the content of recurrent themes. . . . Burt and Sen (in
an unpublished memorandum) advocate a more straightforward approach, owing little or nothing
to abnormal psychology.

"Many writers have testified to the value of T.A.T. in mental hospitals, clinics, etc., especially
when used with other tests such as Rorschach. But scientific evidence is conflicting. . . . Harrison
was able to assess I.Q.s with a validity of ·78. Sen finds the reliability of assessments by different
scorers for Burt's qualities to average only ·4; nevertheless such qualities as Observation, Verbal,
Level of Organization, and Maturity gave very promising correlations with follow-up results among
high-grade civil servants" ((22), pp. 181-4).


¹ Cf. (9), (10), and The Young Delinquent (1925), pp. 40 f.

² Even to-day, if [...] to know what is meant by an 'apperception test,' turns to the
index of [...] Text-Book of psychology, he will find the word neither in Woodworth nor in Boring,
Langfeld, and Weld. The temporary disappearance of the term in British literature is due largely
to Spearman's [...]: referring to "apperception and assimilation," he maintains that his
doctrine of the [...] correlates (really a revised form of associationism) shows
that "neither term can supply anything that psychology needs" (The Nature of Intelligence and the
Principles of Cognition, 1921, p. 26).

³ Earl awards high [...] to the inkblot test (as popularized by Rorschach and his followers), a
test which older writers showed was so largely dependent on apperception as above defined (14).
Yet the intuitive methods of scoring the replies, as he describes, could only be accepted as plausible
by one who was entirely unfamiliar with the theoretical and experimental work on apperception.


Several correspondents have since asked if fuller details could be made available in regard
to both the method of approach here referred to and the results obtained. Accordingly, I have
endeavoured to summarize as succinctly as possible the general principles adopted in using and
marking the test, and the main conclusions so far emerging in what (it should be emphasized) are
merely preliminary trials.


II. SUBJECTS


The present investigation was based on the performances of 100 candidates for
the Administrative Class of the Civil Service.


been published by Mr. N. A. B. Wilson (19), who was at that time Principal Psychologist. Briefly it
included a written qualifying examination, followed by a two- or three-day examination during a
kind of 'house party' by a Civil Service Selection Board (which included psychologists and was
known as CISSB), and ending with an interview by a Final Selection Board (FSB). A mark or
grading was given at each stage, but the ultimate decision rested with the Final Selection Board.
This Board judged each applicant on his entire record, including the marks received in the earlier
stages, and his prowess at the final interview (20). Detailed results indicating the apparent value
of this procedure and of its component parts (excepting the projection tests) have been given by

The main criteria used in the present research were derived from reports sent
in on successful candidates a year or more after their appointment. The reports
were submitted on special 'follow-up forms' containing spaces for (a) five-point
ratings on the thirteen qualities believed to be most relevant to their duties, (b) an
overall grading for general efficiency up to date, (c) an estimate of the highest rank
in which the officer might be successful (termed the 'final rank'), and (d) a free pen [...]


2 ing: l. A young 
2. A boy holding a violin, with a girl 


by her.  Two-and-a-half minutes were allowed for ti looking up at a man standing 
about each, 


At the time of my enquiry this set of pict: 
for a detailed item-analysis to be made kei uid 


officers, of course, have dw was Working for the Selection Board, 
ached. по responsibility for any of the views here put forward or for te жіне, 


2. Mode of Scoring. As Rapaport observes ((17), p. 423), the usual techniques for
interpreting the thematic apperception test "are not hard-and-fast rules like those of
scoring other tests; they are rather viewpoints for looking on the stories, which have
to become ingrained in the examiner." Murray and his collaborators drew up an
elaborate scheme of 'needs' and 'presses' (which, as Cattell observes, is, in effect,
an amalgam of McDougall's primary emotions with Freud's theory of fantasy formation);
and "suggested and used depth interpretations which they compared with
psychoanalytic interpretations of dreams" ((18), p. 214).

The immense number of headings involved in Murray's scheme of itself renders his method
quite impracticable in any ordinary selection procedure; and the peculiar terminology employed
('conservance,' 'infavoidance,' 'similance,' 'succorance,' and the like) is exceedingly difficult to
interpret with any exactitude. Those who have endeavoured to improve upon it still choose their
qualities or traits in the light of intuitive speculation or ad hoc improvisation, and so produce widely
varying schemes for which there is really very little empirical support. Full details of the methods
and of the modifications suggested by their followers are given by Rapaport and Bell, and
summarized by Vernon. Vernon repeatedly stresses the extreme subjectivity of all such procedures;
and rightly adds that "the lack of any generally accepted framework for the description of personality
creates further difficulties" ((22), p. 205). In addition it is hard to discover any precise information
regarding either the reliability or the validity of the methods used by earlier investigators.

During their work with school children, extending over more than forty years,
Burt and his co-workers have evolved a structural scheme for the study of personality;
and in their researches on pictorial tests they have tentatively worked out a systematic
procedure for analysing both the formal and material characteristics of the narrative
compositions of children (9). It has been plainly influenced by the general practice
of literary and biographical critics; but differs in retaining the same scheme for all
the writers to be studied, in attempting to give a quantitative estimate for each
characteristic, and above all in having been subjected to some degree of empirical
verification. It employs a stock schedule for the classification of the commoner
motifs (recurrent topics, subjects, or themes) according to the dominant emotions
and salient interests (sentiments and complexes) which each appears to indicate, with
special reference to their dynamic interactions.

The choice of traits has been based on the results of factorial studies ((8), (13), (16)), and checked
by systematic comparisons with the after-histories of the cases tested; the inferences drawn can thus
be fitted into a recognized psychographic scheme available for other tests. It is often objected that
assessments for qualities derived by factorizing tests or traits fail to give a true synthetic portrait
of the individual as a whole. To overcome this difficulty, Burt also grades the traits in order of
their prominence in one and the same person. This enables a factorial ‘psychogram’ to be
constructed, and yields a sounder estimate for what are sometimes called ‘temperamental types’ and
should perhaps rather be termed ‘temperamental tendencies.’ ¹

With a given set of pictures the commoner features of the stories which they tend to elicit are
arranged according to the various cognitive or orectic characteristics they are presumed to indicate.
In marking the scripts, a prepared schedule is generally used; pluses are entered in the appropriate
spaces for the several signs or symptoms noted in each candidate's script (pluses or minuses in the
case of bipolar traits). An allowance is given for over-all impressions. These scores are then added
up, and treated as measurements for the several traits.
No doubt with pupils from the elementary schools the structure of the individual's
personality is much simpler than that of the highly intelligent adult. Nevertheless,
it seemed profitable to enquire whether this scheme might not provide a better starting-
point than the unverified speculations which seemed to form the basis of the
American techniques. In the investigations described below, an attempt was made
to score the results of the test in accordance with these principles. The scheme
covers both cognitive abilities and orectic (affective and conative) characteristics:
but for practical reasons it could not be adopted in its entirety. In the present
enquiry the figures quoted will be confined to those particular characteristics which


¹ Cf. (13), and Factors of the Mind, p. 427, Fig. 1.


A Preliminary Study of the Thematic Apperception Test 


seemed to be most closely related to the duties that would be required of the successful
candidates and which previous attempts had shown could be rated with greatest
reliability.


IV. RELIABILITY AND VALIDITY OF 
ASSESSMENTS 


A. Reliability. The stories written by a hundred candidates were scored by
Mrs. McArthur and myself. Further, the scripts of twenty candidates at an earlier
examination, which I had already scored in the same way, were re-marked after an
interval of nine months, when, owing to intervening work, I found I had completely
forgotten my original impressions. The results obtained (a) for the self-consistency
of the same interpreter and (b) for the reliability of the two independent assessments
are shown in Table I. With a sample of 100, coefficients over 0.197 and 0.256 are
significant at the 5 per cent. and the 1 per cent. levels respectively; with a sample
of 20, coefficients above 0.444 and 0.561 are significant at the same levels.
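The significance limits quoted for samples of 100 and of 20 can be checked numerically. The sketch below is an illustration added for the reader and not part of the original computations: it assumes the usual two-tailed test of a product-moment coefficient with n - 2 degrees of freedom, and the function names `t_cdf` and `critical_r` are hypothetical. It integrates the t density by Simpson's rule, finds the critical t by bisection, and converts it by r = t / sqrt(t^2 + n - 2).

```python
import math


def t_cdf(t, df, steps=2000):
    """CDF of Student's t for t >= 0, by Simpson integration of the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                       # steps is even, as Simpson's rule needs
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def critical_r(n, alpha):
    """Smallest |r| significant at the two-tailed level alpha with n cases."""
    df = n - 2
    lo, hi = 0.0, 50.0
    for _ in range(60):                 # bisection for the critical t
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2
    return t / math.sqrt(t * t + df)    # convert t to a correlation


for n in (100, 20):
    print(n, round(critical_r(n, 0.05), 3), round(critical_r(n, 0.01), 3))
```

Under these assumptions the printed limits should agree closely with the figures given in the text.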


TABLE I. COEFFICIENTS OF SELF-CONSISTENCY AND RELIABILITY


                                          Self-consistency    Reliability of
Traits Assessed                             of One Judge        Two Judges
A. Cognitive
 1. General Organization of Story              (assessed only once)
 2. Observation                                 .492               .287
 3. Verbal Ability                              .758               .467
 4. Imagination                                 .605               .588
B. Orectic
 5. General Emotionality                        .634               .395
 6. Extraversion-Introversion                    ..                .222
 7. Cheerful-Depressive                         .115               .628
 8. Self-assertive-Self-submissive              .426               .466
 9. Anxiety                                      ..                .451
10. Decisiveness                                .560               .418
11. Ambition                                    .169               .295
12. Social Consciousness                        .804               .452
13. Maturity-Immaturity                         .718               .485
14. Integration                             [illegible]            .371
    Average                                     .588               .425


To judge from the more careful studies reported in the literature, the reliability of
so-called ‘projective tests of personality’ seldom reaches a figure of 0.400 [...]
our previous attempts. The table above shows that all except one of the
self-consistency coefficients, and nine of the reliability coefficients, rise above that level. Somewhat
[...] assessed is Cheerfulness. The three poorest [...], Ambition, and [...]
is non-significant. Further [...] indicated that, in these
cases, the low coefficients were due to the fact that the two [...]
([...] in general life), whereas the other interpreted the term [...]
observation of what actually appeared in the picture. In this and other instances much of the
[...] definition of the quality to be assessed and from incomplete



specifications of the relevant clues. The term Ambition, for example (described as ‘an ardent
desire for personal distinction or glory’), is a somewhat ambiguous concept; and the phrase
‘Self-assertion versus Self-submission’ (described as McDougall defines those tendencies) appears to
treat aggressive tendencies and self-assertive tendencies as almost synonymous: although correlated,
it seems clear that the two should be assessed separately. Moreover, some of the pictures were not
altogether suited to elicit trustworthy clues to the qualities required. But these are defects in the
procedure which could doubtless be corrected after further research.

B. Validity. In judging validity the criteria used were (a) the marks allotted by
the Final Selection Board after the three-day examination and (b) the various marks
and gradings provided by the follow-up forms. Data sufficiently complete for
comparison were available for only 42 of the successful candidates. The scores for
all the traits were first correlated with suitability as estimated (i) by the Final Selection
Board, (ii) by the Over-all Grading, and (iii) by the estimate of ‘Final Rank.’ Table II
shows the results.

With samples of this size, correlations above 0.304 and 0.393 are significant
at the 5 per cent. and 1 per cent. levels respectively. The correlations are no doubt
greatly reduced by the fact that those candidates who obtained low marks for desirable
qualities failed to secure appointment; they had consequently to be omitted from the
calculations for validity. Any attempt to correct for this type of selection by statistical
calculation seemed highly precarious. Hence the figures are printed as they stand,
and must be compared with uncorrected coefficients given in other reports, not
with coefficients already corrected for selection.


TABLE II. CORRELATIONS WITH SUITABILITY


                                           (i)         (ii)       (iii)
Traits Assessed                           Final      Over-all     Final     Average
                                        Selection    Grading      Rank
A. Cognitive
 1. General Organization of Story         .421        .318        .354       .367
 2. Observation                           .268        .331        .401       .333
 3. Verbal Ability                        .344        .237        .359       .313
 4. Imagination                           .328        .168        .254       .250
B. Orectic
 5. General Emotionality              [illegible]     .100        .203       .143
 6. Extraversion-Introversion             .219        .118        .080       .108
 7. Cheerful-Depressive                   .117        .147        .183       .149
 8. Self-assertive-Self-submissive       -.070       -.228       -.151      -.150
 9. Anxiety                              -.189       -.130       -.104      -.128
10. Decisiveness                          .198        .074        .137       .136
11. Ambition                              .029        .044        .027       .033
12. Social Consciousness                  .037        .094        .207    [illegible]
13. Maturity-Immaturity                   .309        .183        .335       .276
14. Integration                           .233        .299        .325       .286
    Average                               .204        .170        .231       .201


Note. In Tables II and III the negative signs have been reversed in calculating the averages at the
foot of the columns.
Owing to the small size of the sample, only six traits furnish one or more correlations that are
statistically significant; and of these four are cognitive. Nevertheless, the figures are on the whole
higher than the validities obtained for most of the selection traits used at CISSB (4), where the highest
correlation with the criterion was 0.186 (for the ‘Short Talk’). In general the estimate


of Final Rank correlates more closely than the other two. The highest average correlation is obtained
by the marks for ‘General Organization of Story.’ This is the only characteristic that provides a
fully significant coefficient with all three gradings: it may perhaps be taken as a measure of the
[...] Imagination, Lack of Anxiety, Extraversion, and Decisiveness. The estimate of Final Rank, which
gives a long-term view of a candidate's suitability, correlates most highly with Observation and
[...]

Since different Government departments require slightly different qualities, 
an attempt was made to determine how far these more specific qualities, as assessed 
by the supervisors, could be predicted by the marks for the several traits that approxi- 
mate most closely to these qualities. Table III shows the correlations between the 
most relevant assessments derived from the test and the nearest assessments appearing 
on the report forms. All the correlations point in the right direction. But the only
correlations that are significant at the 5 per cent. level are those for Observation and
Imagination.


TABLE III. CORRELATIONS FOR SPECIFIC TRAITS


(a) Traits Assessed by Test            (b) Traits Assessed on Follow-up Form         Correlation

 2. Observation                        Attention to detail                              .304
 3. Verbal Ability                     Paperwork (minutes and correspondence)           .256
 4. Imagination                        Fertility of ideas                               .384
 6. Extraversion-Introversion          New contacts                                     .281
 6. Extraversion-Introversion          Acceptability as colleague                       .116
 7. Cheerful-Depressive                Judgement                                        .196
 8. Self-assertive-Self-submissive     Acceptability as colleague                      -.260
 8. Self-assertive-Self-submissive     Management of subordinate staff                 -.204
10. Decisiveness                       Judgement                                        .264
    Average                                                                             .252
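The rule stated in the Note to Tables II and III (negative signs reversed before averaging) can be illustrated with the column of Table III. This is a minimal sketch added for the reader; the figures are those read from the table, and the function name is hypothetical.

```python
# Correlations as read from Table III, with the signs printed there.
table_iii = [0.304, 0.256, 0.384, 0.281, 0.116, 0.196, -0.260, -0.204, 0.264]


def mean_with_signs_reversed(values):
    """Average the magnitudes, so that negative coefficients count as positive."""
    return sum(abs(v) for v in values) / len(values)


average = mean_with_signs_reversed(table_iii)
print(round(average, 3))
```

Averaging the magnitudes in this way gives a figure that agrees with the average printed at the foot of the table, whereas the ordinary signed mean would be appreciably lower.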


[...] results take us, it may, I think, be reasonably claimed [...]
adopted for evaluating the results [...] with the results of the synthetic judgements
[...] that can be most reliably assessed [...] selected yield the most valid [...]
clues obtained with the pictures [...] fertility of ideas, [...]
‘personal’ qualities (long-term contacts, new contacts, [...]) are grouped with the orectic or



V. SUMMARY AND CONCLUSIONS 


1. The responses to a thematic apperception test obtained from a hundred
Civil Service candidates in the course of the regular entrance examination were scored
by two investigators according to Burt's method of rating for a systematic scheme of
traits based primarily on the results of earlier factorial analyses. The correlations
between separate assessments made by the same judge at an interval of nine months
showed a moderately high consistency for nearly all the traits (average 0.59). The
correlations between the independent ratings made by two judges showed that, when
compared with similar coefficients for other projection tests of personality, or other
methods of marking, the assessments obtained by this method possessed a degree of
reliability that is encouraging, though still not high enough to have much practical
value as they stand. Three traits at least seem to be ill-defined and possibly may
not lend themselves to assessment by this type of test. The traits yielding the largest
reliability coefficients were Verbal Ability, Imagination, Cheerful or Depressive
tendencies, Self-assertive or Self-submissive tendencies, Anxiety, Decisiveness, Social
Consciousness, and Maturity versus Immaturity.

2. The validity coefficients are higher than those obtained with other tests of 
the same characteristics and other methods of marking, and are superior to those 
hitherto published for similar assessments of Civil Service candidates : the correlations 
between the assessments based on the test and the various criteria available suggested 
that, of the traits assessed, those most closely related to suitability for administrative 
work appeared to be Ability for Coherent Organization of the Story, Observation, 
Verbal Ability, Imagination, Maturity, Cheerfulness, and Integration of Personality. 
Correlations with reports on the successful candidates’ efficiency sent in a year or so 
after their appointment indicated that several of the more specific traits could be
assessed with a promising degree of accuracy. 

3. The enquiry as a whole indicates that both the schedule of traits and the
method of scoring form a distinct improvement on those adopted by previous British
workers (for the most part borrowed, with some simplification, from American
contributions). At the same time the detailed results reveal that certain features
in the procedure leave considerable room for improvement; and a study of the chief
sources of discrepancy has suggested what will probably be the most fruitful directions
for future research. It seems clear that the efficacy of the thematic apperception
test can be greatly improved if the traits to be assessed are based on an acceptable
theory of personality-structure, and if the various clues used in the marking-scheme
are selected on the basis of a systematic item-analysis and are explicitly specified and
defined.


REFERENCES 


1. Bryant, S. (1886). ‘Experiments in testing the character of school children.’ J. Anthrop. Inst.,
   XV, 338-49.
2. Stout, G. F. (1888). ‘The Herbartian Psychology.’ Mind, XIII, 321-38, 473-98; XIV, 1-[...].
3. Binet, A., and Henri, V. (1896). ‘Psychologie individuelle.’ Ann. Psychol., III, 296-332.
4. Binet, A., and Simon, T. (1908). ‘Le développement de l'intelligence chez les enfants.’ Ann.
   Psychol., XIV, 1-94.
5. Adams, J. [...]. The Herbartian Psychology applied to Education. London: Heath.
6. Whipple, G. M. (1910). Manual of Mental and Physical Tests. Baltimore: Warwick and York.
7. Burt, C., and Moore, R. C. (1912). ‘The Mental Differences between the Sexes.’ J. Exp. Ped.,
   I, 273-84, 355-88.
8. Burt, C. (1915). ‘General and Specific Factors underlying the Primary Emotions.’ Brit.
   Ass. Ann. Rep., LXXXIV, 694-6.
9. Burt, C. (1934). ‘The Interpretation of Pictures.’ Unpublished Laboratory Notes. University
   College, London.
10. Burt, C. (1935). The Subnormal Mind. London: Oxford University Press.
11. Morgan, C. D., and Murray, H. A. (1935). ‘A Method for Investigating Fantasies: the
    Thematic Apperception Test.’ Arch. Neurol. & Psychiat., XXXIV, 289-306.
12. Murray, H. A., et al. (1938). Explorations in Personality. New York: Oxford University Press.
13. Burt, C. (1939). ‘The factorial analysis of emotional traits.’ Character and Personality,
    VII, 238-54, 285-99.
14. Earl, C. J. C. (1939). ‘Methods of assessing temperament and personality,’ ap. Bartlett, F. C.,
    et al. The Study of Society (ch. X). London: Kegan Paul.
15. [...] ‘The Thematic Apperception Test.’ Unpublished Laboratory Notes. University [...].
16. Burt, C. (1945). ‘The Assessment of Personality.’ Brit. J. Educ. Psychol., XV, 105-21.
17. Rapaport, D. (1946). Diagnostic Psychological Testing. Chicago: World Book Company.
18. Bell, J. E. (1948). Projective Techniques. New York: Longmans, Green & Co.
19. Wilson, N. A. B. (1948). ‘The Work of the Civil Service Selection Board.’ Occup. Psychol., [...].
20. Vernon, P. E. (1950). ‘The Validation of Civil Service Selection Board Procedure.’ Occup.
    Psychol., XXIV, 75-95.
21. Sen, A. (1950). ‘A statistical study of the Rorschach test.’ Brit. J. Psych., Stat. Sect., III,
    21-39.
22. Vernon, P. E. (1953). Personality Tests and Assessments. London: University of London [...].




Vol. VI, Part II        The British Journal of Statistical Psychology        November 1953


THE FACTOR ANALYSIS OF MATRICES 
OF NEGATIVE CORRELATIONS 


By PATRICK SLATER 
Bedford College, University of London 


I. Introduction. II. Statistical Conditions for Negative Matrices. III. Experimental
Conditions. IV. Analysis of New Example. V. Analysis of the Previous Example.
VI. Methodological Discussion. VII. Summary.


I. INTRODUCTION 


In the discussion of a matrix of predominantly negative correlations in a previous 
paper (1), a first attempt was made to define the general conditions under which such 
matrices might occur. Taking into consideration another instance of a slightly 
different type (cf. Sect. III), a more adequate definition can now be reached. 


[...] psychological experiments where negative matrices were likely to occur. But it was
[...] could be eliminated by introducing a transformation.

Burt, in a Reply in the same issue (2), expressed the view that a factor analysis 
would be possible without the extraction of any imaginary factor, and he doubted 
whether any preliminary transformation was required. He gave a factor analysis 
of my figures, but did not include a complete answer to the question discussed in 
detail here, how factor analysis should be applied to such matrices. However, the 
main conclusion now reached is in close agreement with Burt's original opinion. It 
is in fact shown that an efficient analysis is possible without transforming the matrix 
or extracting any imaginary factor, a transformation being only required in the test 
of significance.

Both the original example and the one adduced here are instances of matrices
of variances and covariances between variables in a common metric, which do not need to
be standardized. This permits the use of a variant of Bartlett's test ((3), (4), and
(5)) which has theoretical advantages but is appropriate only [...]
applied without prior standardization. In experiments involving [...]
comparisons, where negative matrices are particularly likely to occur, a common
metric can frequently be used for recording the simultaneous variations; and
an illustration of the treatment of unstandardized matrices may be found helpful.




II. STATISTICAL CONDITIONS FOR NEGATIVE
MATRICES


Suppose a matrix of correlations has been obtained by measuring m variables,
A, B, etc., in n cases, the observed measurements in a particular case being $a_0, b_0, \ldots, m_0$.
The sum of these measurements, $s_0$ say, will usually also vary from case to case, and
may be treated as a further variable, S. If $V_a$, $V_b$, etc., denote the total variances of
the variables and $W_{ab}$, etc., their covariances, we may write $\Sigma V = V_a + \ldots + V_m$
and $\Sigma W = W_{ab} + \ldots + W_{lm}$ (summing the covariances without duplication) and shall
then obtain, for the total variance of S, $V_s = \Sigma V + 2\Sigma W$, i.e., the sum of all the
terms in the square matrix of total variances and covariances. When $\Sigma W$ is negative,
i.e., when $\Sigma V > V_s$, the matrix of correlations will be preponderantly, though not
necessarily completely, negative.
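The identity for the variance of the sum can be illustrated with a small calculation. The sketch below is an addition for the reader, using the mean square variances and covariances of the three variables E, C and F as read from Table I B of Sect. III; the variable names are hypothetical.

```python
# Total variances and covariances (figures as read from Table I B of Sect. III).
variances = {"E": 1.5912, "C": 4.7818, "F": 6.9723}
covariances = {("E", "C"): -0.9213, ("E", "F"): -1.3582, ("C", "F"): -0.6431}

sum_v = sum(variances.values())       # Sigma V
sum_w = sum(covariances.values())     # Sigma W, each pair counted once
v_s = sum_v + 2 * sum_w               # variance of the sum S

print(f"Sigma V = {sum_v:.4f}, Sigma W = {sum_w:.4f}, V_s = {v_s:.4f}")
print("Sigma W negative:", sum_w < 0, "| Sigma V > V_s:", sum_v > v_s)
```

With these figures the sum of the covariances is negative and the variance of S falls well short of the sum of the separate variances, which is the condition described above for a preponderantly negative matrix.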


This may happen if the conditions of the experiment [...]
only indirectly restricting A, B, etc. If S is constant, $V_s = 0$ and
$\Sigma W = -\tfrac{1}{2}\Sigma V$. This is evidently the negative limit of $\Sigma W$ when A, B, etc., are real scalar
measurements. For if it exceeds this, $V_s$ must be negative; but as $V_s$ [...]
negative unless some of its component terms, such as $s_0$, are imaginary or complex; and as $s_0$ too
is a sum, this implies that some of its components, $a_0$, etc., must be imaginary or complex also. In
short, some at least of the dimensions in which variation [...]
or complex ab initio. Matrices of this kind are not discussed below. [...]
example. But there would not appear to be [...]
factors from them.


Where $V_s = 0$, the conditions of the experiment [...]
$-1/(m - 1)$. Tests of significance are complicated [...]
of latent roots [...]

III. EXPERIMENTAL CONDITIONS


A simple instance [...] is provided by the following [...].
The number of years [...] (outside the service), [...]
the age for admission to the duty is restricted, those who have occupied more time in one way will
have passed less in another; and the three variables are negatively correlated. Table I gives the
[...] statistics.


TABLE I. ANALYSIS OF DATA FROM MEN IN THE FORCES


                            A. Means             B. Mean Square Variances         C. Correlations
                                                 and Covariances (d.f. = 69)
Variable              Transferred  Re-entrants
                       (n = 54)     (n = 16)       E.        C.        F.         E.       C.       F.
(E) Education            1.65         2.25       1.5912    -.9213   -1.3582       ..    -.3340   -.4078
(C) Civil employment     1.39         4.19       -.9213    4.7818    -.6431    -.3340      ..    -.1114
(F) Service              5.13         3.19      -1.3582    -.6431    6.9723    -.4078   -.1114      ..
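The correlations in part C of Table I follow from the variances and covariances in part B by dividing each covariance by the square root of the product of the two variances concerned. A minimal sketch of that step, added for the reader with the figures as read from the table (the helper name `correlation` is hypothetical):

```python
import math

# Mean square variances and covariances as read from Table I B.
cov = {
    ("E", "E"): 1.5912, ("C", "C"): 4.7818, ("F", "F"): 6.9723,
    ("E", "C"): -0.9213, ("E", "F"): -1.3582, ("C", "F"): -0.6431,
}


def correlation(a, b):
    """Covariance of a and b divided by the product of their standard deviations."""
    return cov[(a, b)] / math.sqrt(cov[(a, a)] * cov[(b, b)])


for pair in (("E", "C"), ("E", "F"), ("C", "F")):
    print(pair, round(correlation(*pair), 4))
```

All three coefficients come out negative, in agreement with part C of the table.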


When the simultaneous variation of a number of variables is subject to the condition that some 
cannot increase unless others diminish, a choice between the variables must occur at some point. 
This may not always be the deliberate choice of a human being, but in psychological experiments 
it often is. Thus in this experiment an element of choice may be found in the way in which each 
man decided to employ his time : the statistics represent the accumulated effects of such acts of 
decision. It may also be found in the alternatives open to the selection board, who cannot simul- 
taneously require extensive experience in many different fields if they wish to recruit men still in the 
prime of life. 

In psychological experiments with methods of ranking or paired comparison the entire variation
is usually attributable to acts of deliberate choice. In preferring one of the alternatives presented 
to him the subject declines another. Correlating preferences for different alternatives thus gives 
rise to predominantly negative matrices. Face to face with death there is not much deliberate 
choice open to an individual, but the situation is analogous in that by dying in one way he escapes 
dying in any other. A choice is made for him if not by him. Thus negative correlations are commonly 
found when fluctuations in mortality from different causes are correlated over a period during which 
the numbers and age composition of the population are tending to remain constant. 

If such analogies are admissible, it would seem reasonable to enquire after an element of choice
in the experimental conditions whenever a negative matrix is observed. Conversely, when an experiment
involves some such element, some approximation to a negative form of matrix may be expected,
and the method of analysis adopted should take this into account.

The mortality statistics illustrate another point. If the period under consideration is one when 
the numbers and age composition of the population are undergoing substantial changes, years when 
deaths from any particular cause are common are likely to be years when the total number of deaths 
is large; and a general tendency towards positive correlation may appear among the causes. In 
the former case the total number of deaths from all causes (S in the notation of Sect. II) is not free
to vary widely; in the latter it is. The transition from predominantly positive correlation matrices,
yielding positive general factors of the familiar kind, to predominantly negative ones, productive
(when improperly analysed) of imaginary general factors, may thus be seen to be gradual and subject
to control by any processes or conditions which expand or contract $V_s$ relatively to $\Sigma V$.


IV. ANALYSIS OF NEW EXAMPLE 


When a three-dimensional model of the total dispersion matrix in Table I B
was constructed, no startling peculiarities [...] were found. Its axes, computed by the
method of principal components, were found to be as shown below. Since the
variables were expressed in a common unit (number of completed years) they were
[...] compared with those obtained by a discriminantal analysis. The function
[...] per cent. of [...].
[...] of the matrices of variances and covariances, total and within
groups, have the corresponding ratio, 11839632 : 6365674.




                                    Factor loading

Variable                       First      Second      Third
(E) Education                 -.5422      -.6668      .9234
(C) Civilian employment       -.4606      2.1213      .2638
(F) Service                   2.6193       .2360      .2369
Latent root                   7.3667      5.0004      .9783
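The latent roots and loadings above can be approximated directly from the unstandardized matrix of Table I B. The following sketch is an addition for the reader and not the author's own computing procedure: it uses Jacobi rotations (swapped in here as a convenient self-contained technique), takes the matrix entries as read from Table I B, and scales each latent vector by the square root of its root to obtain loadings. The function name `jacobi_eigen` is hypothetical.

```python
import math


def jacobi_eigen(matrix, sweeps=50, tol=1e-12):
    """Latent roots and vectors of a symmetric matrix by Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Angle that annihilates the (p, q) element.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):      # accumulate the latent vectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


# Total dispersion matrix for E, C, F (as read from Table I B).
total_dispersion = [
    [1.5912, -0.9213, -1.3582],
    [-0.9213, 4.7818, -0.6431],
    [-1.3582, -0.6431, 6.9723],
]

roots, vectors = jacobi_eigen(total_dispersion)
order = sorted(range(3), key=lambda i: -roots[i])
for rank, i in enumerate(order, start=1):
    # Loading = vector element times the square root of the latent root;
    # the sign of each column is arbitrary.
    loadings = [vectors[k][i] * math.sqrt(roots[i]) for k in range(3)]
    print(rank, round(roots[i], 4), [round(x, 4) for x in loadings])
```

Because the sign of a latent vector is arbitrary, a whole column of loadings may appear with signs reversed relative to the table; the roots themselves should agree with the tabled values to within rounding.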


In internal factor analysis it is generally assumed that if the individuals measured differ in some
respect, and the measurements recorded are sensitive to the difference, the major dimensions along
which the individuals are found to vary may be used to infer and define the nature of the difference
between them. This is the basic factorial hypothesis popularly translated into a belief that factor
analysis can be used to prove or disprove the existence of psychological traits.

The major axis of the total dispersion matrix should accordingly approximate to the axis along
which the dispersion between the groups is greatest. Actually the two axes diverge considerably. The
major amount of variation between the individuals is in their length of service (cf. the first factor,
above) which involves, as it increases, a diminishing length of time spent on education or in civilian
employment. The variation between the groups, however, is mainly due to differences in the amount
of time they have engaged in full time education or civilian employment, especially the former (the
discriminant weights need to be related to the mean square variances, Table I B). Pooling square
methods may be used to show that the two axes correlate by .57, i.e., lie at an angle of approximately
55° to one another. The difference between the groups contributes only 14.6 per cent. of the total
variance of the first component, and slightly more, 18.1 per cent., of the second.
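The step from a correlation between two axes to the angle between them is simply the inverse cosine. A one-line check, added for the reader and using only the coefficient quoted in the text:

```python
import math

correlation_between_axes = 0.57      # figure quoted in the text
angle = math.degrees(math.acos(correlation_between_axes))
print(round(angle, 2))               # angle between the two axes, in degrees
```

The result is consistent with the angle of approximately 55° stated above.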


V. ANALYSIS OF THE PREVIOUS EXAMPLE 


The data given in the previous paper were obtained by asking five groups, saleswomen
(S), light factory workers (F), teachers (T), nurses (N), and clerks (C), 18
women in each group, to score the five occupations for smartness of appearance off
duty by the method of paired comparisons. Here $s_0$, the sum of the scores assigned
by each subject, was constant and consequently $V_s = 0$.
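The consequence of a constant total can be seen in a small simulation. The sketch below is purely illustrative and added for the reader: the preferences are made up with a fixed random seed, and the scoring rule (one point to the preferred member of each pair) is an assumption, since the exact scoring used in the original experiment is not restated here. Whatever the preferences, every row of the resulting covariance matrix sums to zero, so the variance of the total score S vanishes.

```python
import itertools
import random

ITEMS = ["S", "F", "T", "N", "C"]          # the five occupations


def paired_comparison_scores(rng):
    """One hypothetical subject: a point to the preferred item of each pair."""
    scores = dict.fromkeys(ITEMS, 0)
    for a, b in itertools.combinations(ITEMS, 2):
        scores[rng.choice((a, b))] += 1
    return [scores[item] for item in ITEMS]


rng = random.Random(0)                      # fixed seed, made-up preferences
data = [paired_comparison_scores(rng) for _ in range(90)]
n = len(data)
means = [sum(row[j] for row in data) / n for j in range(5)]


def covariance(i, j):
    return sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)


matrix = [[covariance(i, j) for j in range(5)] for i in range(5)]
row_sums = [sum(row) for row in matrix]     # covariance of each item with S
v_s = sum(row_sums)                         # variance of the total score S

print(sorted({sum(row) for row in data}))   # the only total that occurs
print(max(abs(x) for x in row_sums), abs(v_s))
```

A matrix whose rows all sum to zero has the unit vector as a latent vector with root zero, which is why only four of the five possible components can be extracted in such a case.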

The matrix of mean square variances and covariances is shown in Table II A 
below (obtained from the sum of the two matrices given in Tables III A and III B 
of the original paper). The principal components are given in Table II B. 


TABLE II. ANALYSIS OF DATA FROM EMPLOYED WOMEN


            A. Variances and Covariances (d.f. = 89)           B. Principal Components
                                                                   (factor loadings)
Variable      S.       F.       T.       N.       C.        First    Second    Third    Fourth
S           5.0164    .8904  -3.7686  -1.8382   -.3000     -1.9044    .4136   -.2329   1.0791
F                    3.2265  -2.2000  -1.6427   -.2742     -1.0779   -.2852   1.2196   -.7041
T                             6.0270    .7034   -.7618      2.1845   -.7602    .3727    .7336
N                                      4.2517  -1.4742      1.0983   1.6652   -.2513   -.4576
C                                               2.8102      -.3005  -1.0334  -1.1081   -.6512
Latent roots                                               10.8572   4.7612   2.9716   2.8319


Teaching is associated with nursing at one end of the major axis of variation, and sales with
factory work at the other; the diversity of opinion about the relative status of the professions is
familiar and may be treated as a factor. The first two factors have loadings comparable with those
obtained by Burt (2) with a different procedure. The analysis concludes with the extraction of
four factors, the fifth having vanished.
The two major components ($v_1$ and $v_2$) coincide closely with the two principal axes ($u_1$ and $u_2$)
of variation between groups, previously published. Their inter-correlations are as follows:


              $u_1$      $u_2$
  $v_1$      -.9261      .2998
  $v_2$       .3733      .8055


The basic factorial hypothesis (that the difference admitted by the sampling will be located by the 
analysis) attains a much greater validity here. 


VI. METHODOLOGICAL DISCUSSION 


Factor analysis is usually applied in experiments in which variation has been 
recorded in a relatively large number of original dimensions, with the intention of 
showing that a smaller number of derived dimensions will account for most of it. The 
dimensions in which variation is greatest are found first (the method of principal 
components being the most efficient) ; and the remainder may be presumed to be 
dispersed by chance approximately equally among the other dimensions. Bartlett 
uses this hypothesis for his test of significance.

Without modifying the intention of the experiment, the introduction of the
restriction described above modifies the outcome. As $V_s$ diminishes towards 0,
the dispersion contracts in one of its m dimensions, and finally collapses into a
hyperplane of $m - 1$ dimensions. The latent root thus reduced recedes to last in order
of extraction before ultimately disappearing. Thus in the first example the last latent 
root is relatively small; and in the second it has gone. 

It seems doubtful whether Bartlett's test is suitable when $V_s$ is partially restricted. As the
significance of each successive component is judged by comparison with the remainder, a disproportionate
reduction in one of the roots must enhance the apparent significance of its predecessors. When it has
vanished, this difficulty disappears.

The test assumes its simplest and most general form when applied to a matrix of unstandardized
variables, such as Table II. After k roots have been extracted the reduced $\chi^2$ needed for testing the
significance of the remainder is
Mp in -i Qp + 1 + 21) — 24 ou ы. 

with $\tfrac{1}{2}(p - k - 1)(p - k + 2)$ d.f. Here n is the d.f. of the variables, and p the total number of
non-zero latent roots (i.e., $m - 1$ in the notation of Sect. II). $V_{p-k}$ corresponds to $R_{p-k}$ in (3),
the latent roots of the unstandardized matrix being scaled down to an average of 1.

For the additive table of x? in Table III the coefficient of loge V; has been used for all the values 
of $V_{p-k}$. The first latent root is the only one that appears to be beyond the limits of chance.


TABLE III. SIGNIFICANCE TESTS


Latent Root           Analysis of χ²          Values of V(p-k)
                      d.f.        χ²

λ₁    2.0273            4        48.90          V₄    .52884
λ₂     .8890            3         7.59          V₃    .91747
λ₃     .5549            2          .05          V₂    .99942
λ₄     .5288           ..          ..                 ..

Total                   9        56.54                ..

If the test is applied before all the latent roots have been extracted, their mean, needed
to scale them, can be obtained from $\Sigma V/p$, and their product, needed for $V_p$, from $|T'MT|$. Here
M is the matrix to be analysed, $\Sigma V$ is the sum of the terms in its leading diagonal, p the number of
latent roots, and T the transformation matrix quoted in the previous paper, with transpose $T'$.
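The values of V in Table III and the degrees of freedom of the additive analysis can be retraced from the scaled latent roots. The sketch below is an addition for the reader and rests on an assumption inferred from the figures in the table and from the remark on the mean and product of the roots: V is taken as the product of the roots still under test divided by the corresponding power of their mean. The multiplier that converts log V into χ² is not reproduced here, and the function names are hypothetical.

```python
import math


def v_statistic(roots):
    """Product of the roots divided by the matching power of their mean."""
    mean = sum(roots) / len(roots)
    return math.prod(roots) / mean ** len(roots)


def degrees_of_freedom(p, k):
    """d.f. for testing the p - k roots left after k have been extracted."""
    return (p - k - 1) * (p - k + 2) // 2


scaled_roots = [2.0273, 0.8890, 0.5549, 0.5288]   # Table III, average 1
p = len(scaled_roots)
for k in range(p - 1):
    remaining = scaled_roots[k:]
    step_df = degrees_of_freedom(p, k) - degrees_of_freedom(p, k + 1)
    print(f"V_{p - k} = {v_statistic(remaining):.5f}, "
          f"d.f. for root {k + 1}: {step_df}")
```

On this reading the successive differences in degrees of freedom are 4, 3 and 2, summing to the total of 9 shown in the table.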


Such a transformation may thus be used for the test, but is not needed for the analysis. When
the components are extracted directly from M, their loadings are related directly to the original
dimensions, and are thus easier to interpret. The use of T is therefore not recommended except
for the test.


If a disproportionately small latent root is suspected as a result of a restriction on V_s, it could
be eliminated by partialling out the variance of s ; and the analysis would then follow the second
model. To treat the matrix in Table IB in this way, chronological age would be held constant.
But such treatment might not be suitable or even clearly intelligible in every instance.

The factorial hypothesis would not ordinarily be required when the origin of the heterogeneity 
in the sample is known independently, as in these examples. It is more likely to be helpful when it 
is known that the sample is heterogeneous in some respects but the individuals composing it cannot 
be classified independently. However, its application to such samples as these illustrates its validity, 
which is of course limited and varies from one experiment to another. 


VII. SUMMARY 


1. A matrix of negative correlations is defined here as one derived from exclusively 
or at least predominantly negative covariances. Such matrices can occur even when 
the variables concerned yield only real scalar measurements, provided the sum of the 
measurements in each case can vary only to a limited extent if at all. They are 
particularly likely to occur in experiments fixing a limited number of alternatives 
between which a choice has to be made or imposing an analogous set of conditions. 
The transition from a predominantly positive to a predominantly negative matrix 
occurs gradually as the limitations become more rigid. 

2. To postulate an imaginary general factor in analysing them is unnecessary
and confusing : their principal components can be found directly ; no transformation
is required. But the fact that as the transition from [. . .] ; and here a transformation
may be useful. No direct solution is offered for the problem of testing significance
when it is merely reduced, but it may be avoidable in some instances by partialling
the root out completely before proceeding to analyse.

3. Circumstances in which factor analysis might be applied have been described ;
and certain problems involved in determining the psychological connotations of factors,
and in assessing their validity, have been briefly discussed in connexion with
certain concrete examples.


REFERENCES 
1. [. . .] ‘ The transformation of a matrix of negative correlations.’ Brit. J. Psych., [. . .]
2. Burt, C. (1951). ‘ Reply to Mr. Slater's Criticisms.’ Brit. J. Psych., Stat. Sect., IV, 18-20.
3. Bartlett, M. S. (1950). ‘ Tests of significance in factor analysis.’ Brit. J. Psych., Stat. Sect., III, [. . .]
4. Bartlett, M. S. (1951). ‘ A further note on tests of significance in factor analysis.’ Brit. J. Psych.,
Stat. Sect., IV, 1-2.
5. Bartlett, M. S. [. . .] ‘ The effect of standardization on a χ² approximation in factor analysis.’ [. . .]


Vol. VI The British Journal of Statistical Psychology November, 
Part II 1953 


NOTES AND CORRESPONDENCE 


ON THE OBLIGATION TO REFUTE RIVAL HYPOTHESES 
By L. F. RICHARDSON 


In the recent controversy between Dr. H. J. Eysenck and his reviewer (W. L. G.), much emphasis
is laid by the latter on the need to disprove alternative hypotheses (this Journal, V, pp. 208f. ; VI,
pp. 44f.). I am impressed both by the importance of this requirement and by the difficulty of
satisfying it. The practical difficulty is obvious : for how can anybody be certain that he has 
thought of all the possible alternatives ? They can be so extremely diverse. Take, for example, 
a long-lasting theory, that of Sir Isaac Newton about the motion of the planets. At the time when 
it was first discussed, permit me to imagine that there may have been people whose alternative was : 
“ God created the planets ; the planets obey their Creator ; this view of their movement is
comprehensive and satisfying ; Mr. Newton's notions about attraction and momentum are superfluous.”
Even to-day it would be difficult to refute that alternative : all we can say is that it yields no predic- 
tions. In the next century Clairaut's alternative theory was that the attraction did not vary precisely 
as the inverse square, but as some slightly different function of the distance (H. Lamb, Dynamics, 
Cambridge, 1920, p. 271). From the year 1916 the much-discussed alternative theory was Einstein's, 
and it was strangely different from any alternative that had been proposed in the previous two 
centuries. Sir Harold Jeffreys, after mentioning some small discrepancies between theory and 
observation that still remain unexplained, says: “ There has not been a single date in the history 
of the law of gravitation when a modern significance test would not have rejected all laws and left 
us with no law. Nevertheless the law did lead to improvement for centuries . . ." (Theory of 
Probability, 2nd ed., p. 362).

As a consequence of W. L. G.'s insistence, I try now to refute the alternatives to my theory of
arms-races (this Journal, current number, p. 89). Very briefly my own theory is : “ The present
arms-race is due to mutual stimulation.” What are the alternative theories ? The theory prevalent
in Britain and U.S.A. is : “ The present arms-race is entirely caused by the aggressive policy of
Russia and her satellites.” Another theory, prevalent in Communist countries, is that “ the present
arms-race is entirely caused by the aggressive designs of the capitalist countries.” I do not indeed
feel able to refute these theories one at a time : but do they not refute each other, at least as far
as the assertion “ entirely ” is concerned ? What is left of each, when “ entirely ” is omitted, is the
mutual stimulation which I have formulated.


REPLY BY THE REVIEWER 


I am glad that so experienced a scientist as Dr. Richardson has taken up the questions raised
by the recent enthusiasm of certain psychologists for a purely ‘ hypothetico-deductive ’ methodology,
and that he agrees, at any rate in principle, with what I said about the need to refute alternative
hypotheses. I willingly accept his main criticism of my argument, namely, that it failed to recognize
the practical difficulties involved in trying to satisfy the logical requirements. If I did not explicitly
refer to this point, that was partly because my reply to Dr. Eysenck was already too long, and partly
because, in this portion of it, I was merely summarizing views already expounded at greater length
by Burt in The Factors of the Mind (pp. 31-47). May we hope that he himself will deal more fully
with the [. . .]
Dr. Richardson, like Dr. Eysenck and several other writers, cites Newton's [. . .] as a model ;
and on this point I had, in a previous contribution (this Journal, V, p. 208), [. . .]
endorsed Burt's remark deprecating the unfortunate way in which “ earlier [. . .] Hume,
Adam Smith, Bentham, and James Mill, all, explicitly or implicitly, [. . .]
deductive method for their model.” Obviously, however, I should have offered evidence for this
view. There are several reasons why Newton's method seems [. . .]
(1) It is generally agreed that in the Principia Newton deliberately decided on a hypothetico-deductive
method of exposition, because he thought that would be the most convincing way to

¹ I am indebted both to Professor Burt and to Dr. Richardson for reading my note in draft, and
for their written comments and corrections : Professor Burt has also allowed me to make use of
his lecture-notes [. . .] references and illustrations. But for the applications made in reference to the
present [. . .] responsible. Incidentally, I am puzzled by the [. . .]
to ‘ Bacon's method ’ : Jevons, de Morgan, and Welton all assure us that “ it has not been followed
by any of the great masters of science ” (Principles of Science, p. 507).



[. . .] his theory to his own contemporaries : instead of “ the new inductive method,” he adopted
“ [. . .] method,” which started with definitions and axioms (or “ laws ”), and then
deduced a succession of theorems by a train of demonstrative reasoning. This would be the only
orthodox procedure for the Cartesians, who would form his main critics.
(2) The phenomena with which Newton was concerned in the Principia were far simpler than
those dealt with by the student of individual psychology. The number of variables involved
(a few large bodies each possessing mass, and moving with the passage of time in three-dimensional
space) was sufficiently small for a comparatively exact analysis to be made by relatively simple
mathematical procedures. The most rigorous behaviourist would never attempt to “ deduce ”
the orbit of a golfer hunting for his ball from three hypothetical laws of motion and one universal
principle in the way Newton attempted to “ deduce ” the orbit of the moon.
(3) What Dr. Richardson calls Newton's “ long lasting theory about the motion of the planets ”
is not a ‘ hypothesis,’ in the sense in which I myself was using that highly ambiguous term. I was
thinking rather of what Dr. Eysenck had called a ‘ hypothesis,’ namely (in W. E. Johnson's phrase),
‘ the initial formulation of a general proposition which it is proposed to establish.’ In the terminology
adopted by Kneale (Probability and Induction, 1949, pp. 92f.), Newton's theory of universal
gravitation is a ‘ transcendent hypothesis ’ to be established by ‘ secondary induction ’ : we accept
it chiefly because it co-ordinates a host of empirical generalizations by means of a simple and fruitful
set of postulates. But what I had in mind was an ‘ empirical generalization ’ of comparatively
narrow range such as might be established by ‘ primary induction ’ as the result of a single factorial
research.


(4) Nevertheless, it is not quite right [. . .] disprove alternatives. [. . .] as a serious rival
to his own, namely, Descartes' theory of vortices ; and a considerable part of Book II of the Principia
is devoted to showing that this theory is inconsistent with Kepler's laws.

[. . .] of his own experimental researches (Phil. Trans., 1672, [. . .] 3075). [. . .]
light (as expounded, for instance, in Kepler's Dioptrice) supposed that pure light was essentially
white, and that colour was a “ qualification of light ” derived from the process of refraction. To
test this view Newton began by making a small circular hole in the window-shutter of a darkened
[. . .] of sunlight [. . .] circular patch on the opposite wall. He then placed a prism near the
[. . .] “ circular,” though [. . .] with colour. Actually the prism produced
“ a coloured spectrum ” of “ an oblong form.” He then considered [. . .]
hypotheses would suffice to explain these peculiarities [. . .]
¹ No doubt on many [. . .] of these points casual anticipations could be found in the work of da Vinci,
[. . .]



through the prism, did not move in curved lines.” By varying the size of the hole and the thickness
of the glass, he was able to eliminate each one. “ The gradual removal of these suspicions,” he says,
“ led me to the experimentum crucis.” This consisted in placing a second board, containing another
small aperture, beyond the prism ; and then “ turning the prism slowly about its axis.” He also
subjected the separated rays to a second refraction, showing that each preserved its colour, and
that together they could be recombined to give white light. The various results conclusively disproved
the corollaries from the accepted theory, and confirmed the corollaries deducible from his own
hypothesis, namely, that “ light itself is a heterogeneous mixture of differently refrangible rays ” and
that “ the same degree of refrangibility ever belongs to the same colour.” It is noteworthy that Hooke
criticized Newton's logic on the ground that the explanation reached was still “ not the only hypothesis.
. . . The same phenomenon will be solved by my hypothesis as well as his.” In particular, as is
now fully realized, Newton in his discussion about the phenomena of light (e.g., those of polarization)
failed to consider that the undulatory hypothesis might take two forms : the waves need not be
longitudinal (as Huyghens and Newton assumed) ; they might also be transverse (as Hooke
tentatively suggested).


certain.” We can only strive to reach the most probable conclusion. For this purpose we must
begin by “ considering the antecedent probabilities of the competing hypotheses.” These may differ
so widely as to justify us in abandoning any attempt to examine “ all the possible alternatives,” and
[. . .]
(like Dr. Richardson's theological alternative) can be neither confirmed nor confuted, then it is not
a hypothesis that the empirical scientist is called upon to consider. Best of all we may be able to
“ plan the research so that only two alternatives are left open, namely, the hypothesis to be proved
and what is commonly called the null hypothesis.”
Burt illustrates¹ his argument by deriving (from the familiar rule for compound probabilities)
the following formula :

$$P_a = \frac{1}{1 + \dfrac{p_n q_n}{p_a q_a}}$$

where a denotes the hypothesis to be established, n its contradictory (e.g., the null hypothesis), p the
initial probabilities, q the likelihoods after the experiments have been made, and P_a the final or
posterior probability. We may assume that neither p_a nor p_n is zero, and (in many cases at any rate)
that they do not differ greatly from each other : otherwise we should hardly have embarked on the
research. But, except perhaps in Mendelian genetics, they can seldom be determined precisely in
any psychological enquiry. If the contradictory hypothesis had the form of a universal proposition
(e.g., there are no group factors underlying psychotic conditions), it could conceivably be [. . .]
outright by a single observation ; q_n would be zero, and P_a = 1. But this is [. . .] Again,
if our experiment can be planned so as to verify a corollary that can be deduced logically from
hypothesis a, then q_a will be 1 ; and P_a will depend mainly on the size of q_n. This is the situation
that we aim at when we try to establish hypothesis a by disproving Fisher's null hypothesis (i.e.,
the hypothesis that the results obtained are attributable to chance). It is obvious that the [. . .]
probability may in some measure be improved either by seeking merely to confirm a (making q_a [. . .])
[. . .] n (making q_n small) : but the improvement [. . .]
small, and in general we are not likely to raise P_a much above half [. . .]
In the particular problem with which Dr. Eysenck [. . .] the same underlying
condition, i.e. differences are quantitative, not qualitative.” Here I fancy most psychiatrists
would hold that p_n [. . .] higher than p_a. In any case [. . .] tests
[. . .] it should be possible
to plan a factorial research so [. . .] qualitatively different types of psychosis,
[. . .] significant ; while if there
were [. . .], even with large samples, [. . .]
q_n). I would also add that, to carry complete conviction, the research
¹ [. . .] that he regards his formula as illustrative only : “ Kneale has argued
that the probability of inductions is not the [. . .] we could not compute it exactly : [. . .] I hold that the
[. . .] in which what Kneale calls the ‘ degrees of acceptability ’ may be
lowered or raised ” (cf. Probability and Induction, pp. 211f., 250f.).



[. . .] alternative hypotheses, not only by factorial procedures, but other methods as well ;
[. . .] fails to find qualitative differences, the psychiatrist will still wonder
whether clinical, biochemical, or therapeutic methods might not have been successful. In short,
as Kneale puts it, “ the scientist's aim is to make a theory which fits all the facts, and at the same
time has no rival ” (op. cit., p. 253 ; his italics).
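
The behaviour of the posterior probability discussed in this reply can be illustrated with a minimal sketch. It takes the formula in the form P_a = p_a q_a / (p_a q_a + p_n q_n), which is the usual rule for compound probabilities and is assumed here to be what the printed formula expresses. The function name and the numerical values, including the equal priors, are hypothetical and chosen only for illustration.

```python
def posterior_probability(p_a, p_n, q_a, q_n):
    """Posterior probability of hypothesis a against its contradictory n.

    p_a, p_n: initial (prior) probabilities; q_a, q_n: likelihoods of the
    observed results under each hypothesis. Equivalent to
    1 / (1 + p_n*q_n / (p_a*q_a)) whenever p_a*q_a is not zero.
    """
    support_a = p_a * q_a
    support_n = p_n * q_n
    if support_a + support_n == 0:
        raise ValueError("the results are impossible under both hypotheses")
    return support_a / (support_a + support_n)


# Equal priors are an assumption made purely for illustration.
refuted = posterior_probability(0.5, 0.5, q_a=1.0, q_n=0.0)     # n disproved outright
verified = posterior_probability(0.5, 0.5, q_a=1.0, q_n=0.05)   # corollary of a verified
ambiguous = posterior_probability(0.5, 0.5, q_a=0.9, q_n=0.9)   # results fit both equally
print(refuted, verified, ambiguous)
```

When the results are about as likely under n as under a, the posterior stays near the prior, which is the point made above about merely confirming a.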
[. . .] I am not quite satisfied with Dr. Richardson's ingenious way of getting rid of rival theories,
namely, by a kind of mutual extermination a priori. It would certainly provide a quick solution
to the controversy between Dr. Eysenck and myself. Following Dr. Richardson I could argue that
“ the theory prevalent in the U.S.A.” is that all correlations between mental traits are “ caused
entirely ” by group factors ; and that another theory, prevalent in monistic circles, is that they are
“ caused entirely ” by a single general factor. Then I could ask : “ do they not refute each other
at least so far as the adverb ‘ entirely ’ is concerned ? ” If so, what is left would be a general-plus-group
factor theory, such as I have supported. In this manner the problem would be disposed of straight
away, without the need for any empirical evidence whatever ! But would Dr. Richardson be
convinced ?


In his own interesting investigation (if I understand his reasoning correctly) he has taken the
three hypotheses which seem most plausible a priori : then, on deducing their consequences, he has
shown that the results of the third seem to agree with facts, whereas those of the first and the second
issued in absurdities, [. . .]


Editorial Note. The correspondence on this subject [. . .]
letters, asking for further references on the issues raised. On the fundamental [. . .]
method, there can be little to add to the full discussions in Russell's [. . .]
‘ Postulates of Scientific Inference ’) and the works he there quotes [. . .]
Here it is scarcely possible to reply in detail to the questions put by the various correspondents or by
the reviewer himself in his [. . .]
it may save some misunderstanding if I briefly indicate one or two [. . .]
disagree.


I cannot help feeling that, in restricting himself chiefly to physicists for illustrations of the logical
procedure most appropriate to psychology, he has laid himself open to much the same kind of
criticism that he urged against Dr. Eysenck. Jeffreys, in the passage cited by Dr. Richardson, has
shown that the methodology of the earlier physicists was by no means so faultless as their psychological
imitators have supposed ; and certainly it is hardly the methodology most suitable for the
psychologist. For early examples of the kind of reasoning appropriate to psychological experiments, I
should prefer to go to biological writers, like Harvey or Cesalpinus, rather than to physicists like
Newton or Galileo.

[. . .] methodological issues to the fore.
As one of the earliest applications of the ‘ [. . .] method of exclusion,’ I should be inclined to cite, not
Galileo's Dimostrazioni Matematiche (1638), but Harvey's De Motu Cordis et Sanguinis (1628).³
The opening sentence of his introduction declares that it will be imperative to begin by stating the
¹ Bertrand Russell, for example, writes : “ Scientific method, as we understand it, comes into the
world full-fledged with Galileo ” (The Scientific Outlook, p. 22 : cf. also Lewin, K., [. . .]
Theory of Personality, ch. 1, ‘ The Conflict between Aristotelian and Galileian Modes of Thought
in Contemporary Psychology ’). Just as English writers, from Hume onward, have urged that
empirical psychology should take Newton for its model, so continental writers have put forward
Galileo. Lewin is of course fully justified in his censures on Aristotle's treatment of physical
problems ; but that has far less relevance for psychology than psychologists are apt to suppose. In
[. . .] biological [. . .] (of which his De Anima was a part) I prefer Darwin's
[. . .] may have been my gods, but they were mere schoolboys compared
to old Aristotle ” (Life and Letters, III, p. 252).
² Cf. Duhem, P., Etudes sur Léonard de Vinci [. . .]
‘ Scientific Method in the School of Padua,’ [. . .] Hist. of Ideas, I, 1940.
³ ‘ Le plus beau livre de la Physiologie,’ as Flourens called it. Harvey and Galileo were at Padua
together ; and the university [. . .] already produced several revolutionary writers in the biological
field.



chief alternative suppositions “ in order that the false [e.g., Galen's hypothesis of ebb and flow] may
be disproved, and the true [viz., Harvey's alternative hypothesis of circulation] may be confirmed,
by accurate observation and multiplied experience ” ; and his unexpected introduction of quantitative
proofs into what is a problem in biology rather than physics is a particularly instructive feature
of his reasoning.

As regards the explicit theory of inductive methodology, there is ample evidence (in spite of the
reviewer's objections) that the change in standpoint was mainly due to Bacon, who in later years
was Harvey's patient. Bacon is the earliest writer who definitely insists that “ the first duty of
induction is rejection or exclusion ” (Novum Organon, 1620, II, Aph. 16 ; cf. Sigwart, Logic, II, p. 296).
Jevons, like most nineteenth century writers, greatly underrated Bacon's early influence. As Reid
observes, “ the best models of inductive reasoning, such as the 3rd book of the Principia and the
Opticks of Newton, were drawn from Bacon's rules ”¹ ; Newton's very phrase ‘ experimentum
crucis ’ is borrowed from the Novum Organon (II, Aph. 36). Hume, in the introduction to his Treatise,
describes the application of the ‘ experimental method of reasoning,’ first to simpler sciences, and
then ‘ at the distance of a century ’ to ‘ the science of man,’ as having originated in the new principles
laid down by ‘ my Lord Bacon ’ ; and he himself constantly uses the disjunctive type of reasoning.²
The psychologist will recall that Bacon announced that his inductive method was to be applied
‘ non tantum ad naturales, sed ad omnes scientias,’ expressly including psychology and social or
political theory (I, Aph. 127). Finally, nearly all the eminent scientists associated with the early
work of the Royal Society (Boyle, Hooke, Newton, Wren, and Wilkins, together with Petty and
Grant, the founders of ‘ Political Arithmetic ’) acknowledge the lead of the Artium Instaurator³ :
and a contemporary ballad describes how


The Prime Virtuosi have undertaken 
Through all the experiments to run 

Of that learned Man, Sir Francis Bacon, 
Shewing what can, what can't, be done. 


For another hundred years a heated conflict dragged on between the English inductive school, which
claimed Bacon as its founder, and the French deductive school, deriving from Descartes. But by
the middle of the 18th century the inductive method had won ; and Bacon is restored to his pedestal
by D'Alembert and the Encyclopédistes.

Yet once again the formulation of the methods proper to the inductive sciences cannot be 
credited to any single thinker. It can only be educed from an historical and critical review of such 
methods as they have actually been worked out and practised by scientists themselves. Nor is there 
as yet any final or satisfactory formulation to which the novice can be referred. As Russell has 
remarked, “ the inductive method is the glory of science, but the scandal of philosophy.” 

In conclusion, may I add that I readily agree with Dr. Richardson that there is still something
that the psychologist may learn from Newton's methodology in dealing with the problems of motion,
provided we reinterpret his procedure in the light of the later developments to which Dr. Richardson
himself alludes ? It has, I think, become clear that the so-called ‘ laws ’ are not so much ‘ hypotheses ’
to be established by inductive or deductive procedures, but simply ‘ principles ’ stating how the
scientist proposes to deal with the phenomena in question : they express, not the conclusions of his
inferences, but the prerequisites of his inferences. When, in dealing with optical phenomena, one
man chooses as his fundamental concept that of the light-ray, while another chooses that of the
light-wave, neither is propounding a generalization which we can show to be either true, or false,
or merely probable : each is announcing a working policy he has determined to adopt.⁴ The same is
true when Riemann and later Einstein, discarding ‘ action at a distance,’ decide to substitute a ‘ field
theory.’

A recognition of this methodological distinction might, I fancy, go far to ease some of the current
disputes in psychology. In the statistical field there are obvious analogues in the use of factor
analysis. The fundamental principles preferred by advocates of this or that factorial method are
not to be regarded as generalizations inferred from observational data, any more than the principle
of the syllogism is a generalization inferred from empirical phenomena ; they indicate the kind of

¹ Works (1785) : p. 200 in Hamilton edition. If a contemporary testimony is desired, I may cite
Professor Broad's considered estimate : “ In his analysis of inductive arguments Bacon was, so
far as I know, breaking new ground ; and all later discussion has followed on his lines ” (Selected
Essays : Ethics and the History of Philosophy, 1952, pp. 141-2).

² E.g., Treatise on Human Nature, I, iv, 5 (‘ On the Immateriality of the Soul ’) : “ The absurdity of
the last two suppositions sufficiently proves the veracity of the first ; nor is there any fourth opinion.”

³ Bacon, so named, figures in the frontispiece designed by Evelyn for Sprat's History of the Royal
Society (1667). The ballad (British Museum) is attributed to J. Glanvill.

⁴ This view has been urged (in somewhat paradoxical terms) by F. P. Ramsey and M. Schlick : cf.
also Popper, K. R., Logik der Forschung.



simplification each proposes to adopt in constructing a workable map or model of the phenomena
he is trying to understand. The choice is rather like that of the maker of geographical
maps or globes : according to his purpose he may seek to locate positions on the earth in terms of a
three-dimensional sphere, of a curved space of two dimensions, or of an orthographic, stereographic,
or Mercator projection. And it would be plainly misleading to treat such devices as general
propositions that must be formally proved or disproved by some process of inductive reasoning.
Having decided on the general line of approach and defined the working concepts it entails, we
then can proceed to set up and test alternative ‘ hypotheses ’ (in the narrower sense of the term) ;
and here I think I am in almost complete agreement with what has been said above about the merits
of the ‘ hypothetico-disjunctive ’ procedure. For indications as to how precisely it should be
employed in any particular research, the reader may be safely referred to Fisher's imaginary
‘ psycho-physical experiment ’ as described at the beginning of The Design of Experiments (1937, pp. 13-29), or,
for more detailed working instructions, to the section dealing with ‘ Statistical Hypotheses and their
Verification ’ in the recent volume on Statistical Inference, by H. M. Walker and J. Lev (1953, pp. 5f.).


C. B.


THE RELATIVE MERITS OF RANKS AND PAIRED COMPARISONS 
To the Editor, British Journal of Statistical Psychology 


Sir, recent articles in this Journal (Durbin, IV, pp. 85-90 ; Whitfield, VI, pp. 35-40) suggest
that the ranking-methods which were introduced into psychology 50 years ago by Spearman and
which were so severely criticized by contemporary psychologists are regaining their old popularity.
They are unquestionably attractive to the practical worker ; but he would like to be assured that the
former criticisms have been adequately met. Titchener and his followers (e.g., Myers and Valentine
in this country) maintained that paired comparison was far superior as an experimental technique,
since it enabled the investigator to discover preferences, etc., that were psychologically compatible,
though logically inconsistent : Brown, in The Essentials of Mental Measurement, wrote still more
emphatically : “ the use of ranks as a measure of a characteristic is inadmissible . . . Some form of
frequency distribution must be assumed ; the method of ranks assumes it to be a rectangle.
[. . .] be more improbable.”

Dr. Whitfield and others, who apparently favour the method, appear to disregard these various
objections. Moreover, Dr. Whitfield's interesting article in the last number of this Journal raises several
additional questions. He shows that the measurements derived from summing m rankings tend,
when m is large, to produce a normal distribution. Does not this imply that, if (as Dr. Brown
observes) “ some form of frequency distribution must be assumed,” it would be safer to assume
that the distributions are normal ? And how are we to calculate the correlation between such a
[. . .] criterion which itself may be given in the form of ranks ? Should we re-rank
[. . .], substituting ordinal numbers for cardinal, or should we convert the criterion
ranks to normal deviates by using Hull's table, as described in so many text-books of psychological
statistics ?


[. . .] problem still leaves many steps somewhat obscure.
He mentions two alternative procedures available for testing the statistical [. . .] of a
measure[. . .] first, [. . .] by Professor Kendall, and secondly,
[. . .] He somewhat curtly declares that “ Kendall's treatment
of the problem of m ranks is inappropriate.” [. . .] he offers no reasons for rejecting this solution
[. . .] by Burt [. . .] having derived a formula for the frequencies to be
expected by chance, but again no proof is given. Since the new solution is of [. . .] interest to
[. . .] Professor Burt or Dr. Whitfield insert the missing steps, and
[. . .] inappropriate and how the other is deduced ?

Reply. 1. The main objections to the technique [. . .]
[. . .]logists as by statisticians, notably by Karl [. . .]
Kendall, who have restor[. . .]
stating that ranks do not form a di[. . .] ‘ measure ’ of the quality to which they refer.
Ranking is rather to be regarded as a special type of [. . .]
classes can be arranged in definite order,
[. . .] the same rank ; and, if the appropriate number of ties is
forced on the ranker, then his ranks can (if we wish) be made to yield any type of distribution,
including [. . .]
[. . .] purposes an individual's rank-number

¹ Cf. Pearson, K., Further Methods of Determining Correlation (Drapers' Company Research
Memoirs, Biometric Series, IV, 1907). [. . .]
summary of Pearson's. Incidentally, the [. . .]


(or rather his rank-number minus one) can best be treated, not as an ordinal number, but as a cardinal
number : it states the number of individuals in the classes prior to his. Although from a practical
standpoint the distinction may seem somewhat academic, it is nevertheless important for the rigorous
deduction of certain formulae. Thus, with this interpretation of rank-numbers, no assumption
need be made about the form of distribution (as Brown erroneously supposed). If so, it follows
that, in addition to ease and speed of working, the use of ranks has a special theoretical advantage
of its own : it provides a convenient non-parametric procedure, i.e., a procedure which can be used
when we desire to avoid any specific assumption either about the distribution of the variables or
about the scale of measurement. In particular, the analysis of variance takes a very simple form
when the data are expressed in terms of ranks.²

2. Let me now deal with the most fundamental of the difficulties raised by Mr. Morris. In
his last paragraph he enquires why the ‘ solution ’ derived by Dr. Whitfield and myself was preferred
to that proposed by Professor Kendall. The explanation is that the two solutions are really solutions
to two different problems, namely, the treatment of summed ranks (a) for individuals and (b) for groups.

The difference can be explained more precisely if we take a concrete case. Let us suppose that 
an education authority has adopted the ‘quota system’ of allocating children to a grammar school:
as a result, School A is allowed (let us say) two pupils, and the teachers are required to name the two 
brightest individuals out of a batch of eight who happen to have reached the appropriate age. The 
headmaster accordingly suggests that the four teachers who know all the pupils should each rank 
them in order of merit, and that the two boys with the highest average rank should be selected. 


TABLE I. TEACHERS’ RANKS FOR EIGHT BOYS

                            Teachers          Total for
Boy    Examination    i    ii   iii   iv      Teachers    Average
T.S.        1         2     2     1     3         8          2
B.R.        2         1     1     3     2         7          1¾
O.F.        3         4     6     8     4        22          5½
R.L.        4         5     4     7     5        21          5¼
E.B.        5         8     5     6     1        20          5
M.W.        6         7     3     5     8        23          5¾
G.A.        7         6     7     2     7        22          5½
J.H.        8         3     8     4     6        21          5¼

Sum of squares of deviations
(a) about general mean   42    42    42    42       300         18¾
(b) about boys’ mean     18¾   19¾   27¼   27¼     = 93


Table I shows the orders actually given by the teachers in a case of this sort; the total and average
rank for each boy are shown on the right. If we could trust the rankings, then B.R. and T.S. would
seem to be the brightest of the eight. It is the statistical investigator’s task to determine whether the
[...] trusted. [...]

[...] psychologist is interested in the trustworthiness of such ranking, obtained [...]
in relation to the group as a whole; the practical [...] only in the trustworthiness
[...] two top boys. The questions are [...]

(a) [...] is the validity of the rankings for the whole group. To determine
this, the natural approach will be to carry out a factorial analysis: in [...] we [...] ascertain
whether there is any general component common to all four rankings, and, if so, whether it is
statistically significant. The correlations between the rankings in Table I, together with the
saturations deduced from them, are shown in Table II.


[...] introduced by Galton) was originally used to designate position as
defined by [...] and distinguished from ‘rank,’ which is reserved for the ordinal
number. Thus Brown, following Pearson, states that “the grade of an individual is measured by
[...] individuals above him.”

[...] reader may refer [...] Amer. Statist. Assoc., [...], 1937, pp. 675-701; Kruskal, W. H. [...],
‘Use of Ranks in One-criterion Variance Analysis,’ ibid., XLVI, 1952, [...];
Friedman, M., ‘A Comparison of Alternative Tests of Significance for the
Problem of m Rankings,’ Ann. Math. Stats., XI, [...].




TABLE II. CORRELATIONS BETWEEN TEACHERS’ RANKINGS

Teachers          i        ii       iii       iv       Total
i              1.0000    .3333     .4048     .2619    2.0000
ii              .3333   1.0000     .2381     .3810    1.9524
iii             .4048    .2381    1.0000    -.0477    1.5952
iv              .2619    .3810    -.0477    1.0000    1.5952
Total          2.0000   1.9524    1.5952    1.5952    7.1428 = 2.6726²
Saturations     .7483    .7305     .5969     .5969    2.6726
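
The entries of Table II can be re-derived from the ranks of Table I. What follows is only a minimal sketch in Python, on two assumptions: that the ranks are untied (so that the ordinary rank-difference formula applies), and that the saturations are obtained by simple summation, i.e., each column total divided by the square root of the grand total. The variable and function names are illustrative and are not the author's.

```python
from math import sqrt

# Ranks awarded by Teachers i-iv (columns) to the eight boys (rows), from Table I.
ranks = [
    (2, 2, 1, 3),  # T.S.
    (1, 1, 3, 2),  # B.R.
    (4, 6, 8, 4),  # O.F.
    (5, 4, 7, 5),  # R.L.
    (8, 5, 6, 1),  # E.B.
    (7, 3, 5, 8),  # M.W.
    (6, 7, 2, 7),  # G.A.
    (3, 8, 4, 6),  # J.H.
]
n = len(ranks)       # number of boys
m = len(ranks[0])    # number of teachers


def rank_correlation(a, b):
    """Rank correlation for untied ranks: 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (len(a) * (len(a) ** 2 - 1))


by_teacher = list(zip(*ranks))  # one tuple of eight ranks per teacher
corr = [[rank_correlation(by_teacher[j], by_teacher[k]) for k in range(m)]
        for j in range(m)]

# Simple summation: column totals (self-correlations of 1 included),
# each divided by the square root of the grand total.
column_totals = [sum(corr[j][k] for j in range(m)) for k in range(m)]
grand_total = sum(column_totals)
saturations = [t / sqrt(grand_total) for t in column_totals]

for row in corr:
    print("  ".join(f"{r:7.4f}" for r in row))
print("totals     ", [round(t, 4) for t in column_totals])
print("saturations", [round(s, 4) for s in saturations])
```

Any difference of a unit in the fourth decimal place between this output and the printed table arises from rounding only.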


To test the significance of the general factor we may follow the procedure outlined in Factors of
the Mind (ch. X). The average of the observed correlations (i.e., the average of the coefficients in
Table II when the self-correlations are excluded) is given (op. cit., p. 275) by the equation

$$\bar{r} = \frac{\sum r - m}{m(m-1)} \qquad \text{(i)}$$

$$\phantom{\bar{r}} = \tfrac{1}{12} \times 3.1428 = .2619.$$


In order to decide whether or not this figure is statistically significant, it is convenient first to calculate
a quantity which I denoted by the symbol η². This is given (pp. 275-6) by the equations

$$\bar{r} = \frac{m\eta^2 - 1}{m - 1},$$

$$
\begin{aligned}
\eta^2 &= \frac{(m-1)\,\bar{r} + 1}{m} && \text{(ii)} \\
       &= \frac{\sum r}{m^2} && \text{(iii)} \\
       &= \tfrac{1}{16} \times 7.1428 = .4464,
\end{aligned}
$$

where Σr includes the m self-correlations (all taken = 1) as well as the observed inter-correlations.


It follows that $\eta^2 = \left(\sum r_g / m\right)^2$ (p. 276). This quantity is the square of the correlation ratio, and could,
if we preferred, be calculated by the more usual formula direct from the original data. We have
in fact (p. 274)

$$\eta^2 = \frac{Q_m}{Q_t} \qquad \text{(iv)}$$

(where $Q_m$ denotes the sum of the squares of the deviations about the means, each squared deviation
being taken m times, and $Q_t$ denotes the sum of the squares of the deviations of each individual mark or rank
about the mean for each teacher)

$$\phantom{\eta^2} = \frac{75}{168} = .4464, \text{ as before.}$$


Alternatively, if we write $Q_r$ for the sums of the squares of the residual deviations (93), so that $Q_m$
and $Q_r$ are the ‘between groups’ and ‘within groups’ square-sums, then
$$Q_t = Q_m + Q_r = 75 + 93 = 168;$$
and we thus have an alternative method for determining the denominator.
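
The three square-sums, and the two quantities deduced from them, may be checked direct from the ranks of Table I. The sketch below (Python, exact fractions; the names `q_m`, `q_r`, `q_t` are merely labels for the quantities defined in the text) assumes nothing beyond those definitions and the relation between the average correlation and η² stated in eqn. (ii).

```python
from fractions import Fraction

# Ranks awarded by Teachers i-iv (columns) to the eight boys (rows), from Table I.
ranks = [
    (2, 2, 1, 3),  # T.S.
    (1, 1, 3, 2),  # B.R.
    (4, 6, 8, 4),  # O.F.
    (5, 4, 7, 5),  # R.L.
    (8, 5, 6, 1),  # E.B.
    (7, 3, 5, 8),  # M.W.
    (6, 7, 2, 7),  # G.A.
    (3, 8, 4, 6),  # J.H.
]
n, m = len(ranks), len(ranks[0])
totals = [sum(row) for row in ranks]
general_mean = Fraction(n + 1, 2)           # mean of the ranks 1..n, here 4.5

# Between boys: squared deviation of each boy's average rank, taken m times.
q_m = sum(m * (Fraction(t, m) - general_mean) ** 2 for t in totals)
# Residual: squared deviation of each rank about that boy's own average.
q_r = sum((r - Fraction(t, m)) ** 2 for row, t in zip(ranks, totals) for r in row)
# Total: squared deviation of each rank about the general mean.
q_t = sum((r - general_mean) ** 2 for row in ranks for r in row)

eta_sq = q_m / q_t                          # eqn. (iv)
r_bar = (m * eta_sq - 1) / (m - 1)          # eqn. (ii) solved for the average correlation

print(q_m, q_r, q_t)
print(round(float(eta_sq), 4), round(float(r_bar), 4))
```

Working in fractions avoids any rounding until the final figures are printed.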
To test the significance of the factor (p. 274) we can now use the procedure adopted for testing
the significance of a correlation ratio. Accordingly, we shall take

$$F = \frac{V_m}{V_r} = \frac{Q_m \div (n)_m}{Q_r \div (n)_r} = \frac{Q_m\,(n-1)(m-1)}{Q_r\,(n-1)}$$

(where $V_m$ and $V_r$ denote the variance ‘between groups’ and ‘within groups’ respectively, and
$(n)_m$ and $(n)_r$ denote the appropriate degrees of freedom)

$$\phantom{F} = \frac{(m-1)\,Q_m}{Q_r} = \frac{3 \times 75}{93} = 3 \times .8064 = 2.4192.$$


On referring to Snedecor's table for the distribution of F, we find that this figure is somewhat below
the value given for F at the 5 per cent. level. We are therefore forced to conclude that the factor is
not statistically significant.

With a larger table we could use a chi-squared test. $m(n-1)\eta^2$ may be regarded as the sum
of the squares of a number of standardized deviations; and, as m increases, the distribution of this
value tends to that of chi squared. Let us therefore calculate

$$\chi^2 = m(n-1)\,\eta^2 = 28 \times .4464 = 12.50. \qquad \text{(v)}$$

Entering the chi-squared table with n − 1 degrees of freedom, we find once again that this value is
definitely below that required for the 5 per cent. level.
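
Both test statistics follow from the square-sums already obtained. The short Python sketch below recomputes them; the two critical values in the comments are the ordinary tabled 5 per cent. points (they are not quoted in the text, and are given here only for comparison).

```python
n, m = 8, 4            # boys ranked, teachers ranking
q_m, q_r = 75, 93      # 'between groups' and 'within groups' square-sums

df_between = n - 1                 # 7
df_within = (n - 1) * (m - 1)      # 21

# Variance ratio: (Q_m / df_between) / (Q_r / df_within) = (m - 1) Q_m / Q_r.
f_ratio = (q_m / df_between) / (q_r / df_within)

# Chi-squared: m (n - 1) eta^2, with eta^2 = Q_m / (Q_m + Q_r).
eta_sq = q_m / (q_m + q_r)
chi_sq = m * (n - 1) * eta_sq

print(round(f_ratio, 4))   # tabled 5 per cent. point for 7 and 21 d.f. is about 2.49
print(round(chi_sq, 2))    # tabled 5 per cent. point for 7 d.f. is about 14.07
```

The unrounded ratio differs slightly in the fourth decimal from the figure 2.4192 in the text, which was reached by first rounding 75/93 to .8064.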
There are two convenient working formulae that often help to shorten the calculations. To avoid
calculating the deviations of the totals (which often introduce fractions), it is quicker to square the
sums as they stand, and then correct for the mean. If $s_i$ denotes the total of the ranks allotted to the
ith boy, then:

$$Q_m = \frac{1}{m}\sum s_i^2 - \tfrac{1}{4}\,mn(n+1)^2 \qquad \text{(vi)}$$

and

$$\eta^2 = \frac{1}{n-1}\left[\frac{12\sum s_i^2}{m^2 n(n+1)} - 3(n+1)\right] \qquad \text{(vii)}$$


The general procedure outlined above is very similar to that suggested by Kendall¹ in dealing
with “the case of m rankings.” He proposes to calculate what he terms a ‘coefficient of concordance,’
expressed by the equation

$$W = \frac{\text{Sum of squares of the deviations of the total ranks about their means}}{\text{The maximum value possible for such a sum}}.$$

In my book (loc. cit., p. 275) I pointed out that $Q_t$, defined as above, is also the maximum value of
$Q_m$. It follows that Kendall's coefficient

$$W = \frac{Q_m}{Q_t} = \eta^2. \qquad \text{(viii)}$$

He also gives eqn. ii and iv (with W written for η²); notes that r̄ is equivalent to the intra-class
correlation; and proposes to use Fisher's z-test to test the significance of the variance-ratios. This,
of course, is equivalent to the F-test. It would seem therefore that the two methods of dealing with
this problem, reached from two somewhat different standpoints, are really identical.


(b) The issue which confronts those who have to decide whether or not this pupil or that deserves 
a scholarship is quite distinct. They are concerned, not with the statistical significance attaching 
to the figures for the group as a whole, but with the statistical significance of the figures for some 
particular individual. In the present example we are required, if possible, to pick out two boys,
and two boys only, for a possible scholarship award. According to the summed ranks, B.R. is 
the best and T.S. nearly as good ; but, if we are to justify this choice, we must be able to show that 
the difference between the figure for T.S. and the average for the whole group is statistically significant. 
Whether or not the detailed ratings of the other six boys are statistically trustworthy no longer 


matters. 
Consider, then, the total mark obtained by the boy T.S. (who is second in order of merit),
namely, 8. This is a deviation of −10 from the average mark, namely, 18. Could such a deviation
have arisen by chance?

To decide this question, let us suppose that the teachers have allotted their ranks to T.S. entirely 
at random. This can best be done by adopting some mechanical method of randomization. We 
may accordingly imagine that each teacher uses an octagonal teetotum, with its sides labelled from 
1 to 8; and, having spun it once, takes whatever number comes uppermost to represent T.S.'s rank.
The four random ranks are then summed. Is it likely that a value so high as 8 would thus be secured
by sheer chance? In order to obtain an answer, we require first of all to determine the frequency- 
distributions of such sums when obtained by purely haphazard assignment. 


1 Kendall, M. G., Advanced Theory of Statistics, I (1943), p. 411; cf. Rank Correlation Methods 
(1948), ch. VI, and refs. The procedure in my book was intended to be of general application. 
Kendall has dealt specifically with data in the form of ranks ; and, in addition to the more systematic 
mathematical investigation of this problem, has shown that the procedure may also be used in the 


case where the ranks are tied. 




Mr. Morris enquires how I derived the formula for such frequencies. The problem is an ancient
one [...] for games of chance (see Todhunter, History of Probability, 1865, art. 987): it is
simply a generalization of the more familiar problem of tossing a coin whose two sides yield two
alternative values only. As is well known, when there are only two alternatives, the distribution is
given by the binomial theorem; when there are more than two, the same reasoning shows that the
distribution will be given by the multinomial theorem.

Let the number of alternatives (i.e., the number of sides to the teetotum) be n; let the number
of teetotums spun (i.e., the number of teachers judging) be m; and let s be the sum of the m ‘ranks’
obtained as a result of the m spins.

Since any one of the n faces may turn uppermost on each of the m teetotums, the number of
different ways in which they may fall after spinning will be $n^m$. To determine the number of ways
in which the sum s may be obtained, the simplest procedure is to use the device of a ‘generating
function.’¹ We find that the number required will be equal to the coefficient of $t^s$ in the expansion of

$$
\begin{aligned}
(t + t^2 + t^3 + \dots + t^n)^m &= t^m\,(1 + t + t^2 + \dots + t^{n-1})^m \\
                                &= t^m\,(1 - t^n)^m\,(1 - t)^{-m}. \qquad \text{(ix)}
\end{aligned}
$$

The coefficient of $t^s$ in this expansion will be the coefficient of $t^{s-m}$ in the expansion of

$$(1 - t^n)^m\,(1 - t)^{-m}.$$

Now

$$(1 - t^n)^m = 1 - m\,t^n + \frac{m(m-1)}{1 \cdot 2}\,t^{2n} - \frac{m(m-1)(m-2)}{1 \cdot 2 \cdot 3}\,t^{3n} + \dots$$

and

$$(1 - t)^{-m} = 1 + m\,t + \frac{m(m+1)}{1 \cdot 2}\,t^2 + \frac{m(m+1)(m+2)}{1 \cdot 2 \cdot 3}\,t^3 + \dots$$

Multiply the two series together and pick out the coefficient of $t^{s-m}$ in the product. This is
easily found to be

$$\frac{m(m+1)\dots(s-1)}{(s-m)!} \;-\; \frac{m}{1}\cdot\frac{m(m+1)\dots(s-n-1)}{(s-n-m)!} \;+\; \frac{m(m-1)}{1 \cdot 2}\cdot\frac{m(m+1)\dots(s-2n-1)}{(s-2n-m)!} \;-\; \dots, \qquad \text{(x)}$$


the series being continued so long as no negative factors appear in the product. The probability
of obtaining s is then obtained by dividing this number by $n^m$.

[...] coefficient for [...]: they are 1, 4, 10, 20, 35. The sum is 70; and dividing this
by $n^m = 4096$, we obtain P = 0.01709.

[...] for us to assume that the distribution of the sum values was
approximately normal, we could argue as follows. The mean of the summed ranks is ½m(n + 1) = 18
(as may be verified from the last column but one in Table I); the variance is
σ² = (1/12) m (n² − 1) = 21; the absolute value of the deviation (d) will be

| sum of ranks for boy − ½m(n + 1) | − ½ = | 8 − 18 | − ½ = 9½

(where the subtraction of ½ is the correction re[quired for continuity]). [Hence]
d/σ = 2.07, which gives P = 0.019. Thus the attempt to dis[...] was
statistically significant, although the attempt to discrimi[nate ...]
was not statistically significant. With this particular group the explanation of the divergence is
simple: as the headmaster put it in discussing the results, “except for the two lads T.S. and B.R.,
there is little to choose between the various members of this class, since the duller pupils of that age
were left in the class below.”
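
The figures for T.S. can be reproduced in a few lines. The Python sketch below evaluates the alternating series (x) by means of binomial coefficients, checks it against a direct enumeration of all 8⁴ = 4096 possible sets of spins, and then repeats the normal approximation with the correction of one half. The function name `ways` is an illustrative label, and the use of the one-tailed normal area is an assumption made to match the figure P = 0.019 quoted in the text.

```python
from itertools import product
from math import comb, erfc, sqrt

n, m = 8, 4        # sides of the teetotum, number of teachers
observed = 8       # summed rank of T.S.


def ways(s, m, n):
    """Coefficient of t**(s - m) in (1 - t**n)**m * (1 - t)**(-m): series (x)."""
    total, k = 0, 0
    while s - m - k * n >= 0:   # stop before a negative factor would appear
        total += (-1) ** k * comb(m, k) * comb(s - k * n - 1, m - 1)
        k += 1
    return total


# Direct enumeration of all n**m equally likely sets of spins, as a check.
brute = {}
for spins in product(range(1, n + 1), repeat=m):
    brute[sum(spins)] = brute.get(sum(spins), 0) + 1

frequencies = [ways(s, m, n) for s in range(m, observed + 1)]
p_exact = sum(frequencies) / n ** m

# Normal approximation with the continuity correction of one half.
mean = m * (n + 1) / 2
variance = m * (n ** 2 - 1) / 12
d = abs(observed - mean) - 0.5
z = d / sqrt(variance)
p_normal = 0.5 * erfc(z / sqrt(2))   # one-tailed area beyond z

print(frequencies, sum(frequencies), round(p_exact, 5))
print(round(z, 2), round(p_normal, 3))
```

With only four teachers the enumeration is trivial; the series (x) becomes the more economical route as m and n grow.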


3. Mr. Morris enquires about the procedure to be followed when we have “an independent
criterion which is itself given in the form of ranks.” As it happens, all the eight boys in the group
described above eventually sat for the final examination for junior county scholarships; and a mark-list
giving all the candidates in order of merit was circulated to the board of examiners (of whom,
as L.C.C. psychologist, I formed one). Accordingly in Table I, I have prefixed a column showing
the ranking of these boys in this later examination; and we may regar[d ...]
criterion” of the kind he has in mind.

1 Cf. Aitken, A. C., Statistical Mathematics (1939), pp. 16f.
Let us therefore suppose that we wish (as Mr. Morris suggests) to calculate some ki[nd of]
correlation between (a) the composite [...] the school teachers’ judgements, [...]
unweighted sum of their several ranks, and (b) the results of the final examination [...]
order of merit. How are we to proceed?

Obviously, to convert the sums to ranks, or the ranks to normal deviates, wo[uld ...]
correlation. The simplest method is to take the unweighted sums as they stand. Let s
denote the sum of the teachers’ ranks, and c the criterion rank; then the ordinary product-moment
formula will yield the following equation:

$$r_m = \frac{4\sum cs - mn(n+1)^2}{\sqrt{\tfrac{1}{3}\,n(n^2-1)\left[4\sum s^2 - m^2 n(n+1)^2\right]}} \qquad \text{(xi)}$$

E.g., with the data given in Table I, we have

$$r_m = \frac{4 \times 732 - 4 \times 8 \times 81}{\sqrt{\tfrac{1}{3} \times 8 \times 63\,[4 \times 2892 - 16 \times 8 \times 81]}} = .748.$$

This formula provides a quick and useful way of reaching an approximate estima[te of the]
multiple correlation with a criterion (it is actually its lower limit). To find [...]
have to weight the rankings furnished by each teacher; and these weights would be computed from
the correlations and the inter-correlations in the usual way.

To ascertain the improvement that has been effected by basing the correlati[on on ...]
rankings simultaneously, instead of on a single ranking only, we can calculate the [...]
correlations that would be obtained by taking each teacher's ranking singly. There is no need
to calculate the correlations explicitly. The average required can be obtained direct by means of
the following formula:

$$r_{av} = \frac{3}{n-1}\left[\frac{4\sum cs}{mn(n+1)} - (n+1)\right] \qquad \text{(xii)}$$

E.g., with the figures given above we have

$$r_{av} = \frac{3}{7}\left[\frac{4 \times 732}{4 \times 8 \times 9} - 9\right] = .500.$$

In this case by taking four teachers’ judgements instead of only one we [...] improve[...] the
‘coefficient of prediction,’ as it has sometimes been called, by [...] 50 per cent. [...]
the accuracy of the foregoing formula by calculating the four correlations separately, and then
finding their average in the usual way. We obtain

¼ (.524 + .762 + .071 + .643) = .500.
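
Formulae (xi) and (xii), and the check by separate correlations, may be verified together. The Python sketch below takes the examination order and the teachers’ ranks from Table I; it assumes untied ranks throughout, and the names `r_m`, `r_av` and `rank_correlation` are illustrative labels only.

```python
from math import sqrt

exam = [1, 2, 3, 4, 5, 6, 7, 8]   # criterion ranks c, in the order of Table I
ranks = [
    (2, 2, 1, 3),  # T.S.
    (1, 1, 3, 2),  # B.R.
    (4, 6, 8, 4),  # O.F.
    (5, 4, 7, 5),  # R.L.
    (8, 5, 6, 1),  # E.B.
    (7, 3, 5, 8),  # M.W.
    (6, 7, 2, 7),  # G.A.
    (3, 8, 4, 6),  # J.H.
]
n, m = len(ranks), len(ranks[0])
sums = [sum(row) for row in ranks]            # s, the summed ranks

sum_cs = sum(c * s for c, s in zip(exam, sums))
sum_s2 = sum(s * s for s in sums)

# Eqn. (xi): correlation of the unweighted sum of ranks with the criterion.
r_m = (4 * sum_cs - m * n * (n + 1) ** 2) / sqrt(
    n * (n ** 2 - 1) / 3 * (4 * sum_s2 - m ** 2 * n * (n + 1) ** 2))

# Eqn. (xii): average of the m single correlations with the criterion.
r_av = 3 / (n - 1) * (4 * sum_cs / (m * n * (n + 1)) - (n + 1))


def rank_correlation(a, b):
    """Rank correlation for untied ranks: 1 - 6 * sum(d^2) / (n (n^2 - 1))."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (len(a) * (len(a) ** 2 - 1))


single = [rank_correlation(exam, teacher) for teacher in zip(*ranks)]
mean_single = sum(single) / m

print(sum_cs, sum_s2)
print(round(r_m, 3), round(r_av, 3))
print([round(r, 3) for r in single], round(mean_single, 3))
```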

Since the distribution of the average rank correlation approaches a normal distribution more quickly
than that of a single rank correlation, we may, as a rough test, take the standard error to be

$$\sigma_{r_{av}} = \frac{1}{\sqrt{m(n-1)}} = \frac{1}{\sqrt{28}} = 0.189. \qquad \text{(xiii)}$$

The value found is therefore statistically significant.

On re-writing the equation for the multiple correlation (xi) in a form analogous to that
for the average correlation (xii) it will be seen that the numerator is the same in both equations.
The denominator differs by the substitution of
$\sqrt{\tfrac{1}{3}\,n(n^2-1)\left[4\sum s^2 - m^2 n(n+1)^2\right]}$ for the factor $\tfrac{1}{3}\,mn(n^2-1)$.
On comparing this with eqn. vi and viii we note that the increase of the multiple correlation over
the average correlation depends on the ratio of the standard deviation of the sums of the ranks to
the standard deviation of a single [...].
4. Mr. Morris also enquires about the alleged superiority of the method of paired compari[son].
Carried out in the manner standardized by Titchener,² the procedure is [...]

1 For converting ranks into normal deviates I suggest using Table XX of Fisher and Yates's
Tables rather than Hull’s table, as suggested by Mr. Morris. It may be noted [that the correlation]
between a set of ranks and the corresponding normal deviates is quite high: with a large sample it
is √(3/π) = 0.98 (a proof is given in the Notes previously cited).

2 Experimental Psychology. I. Qualitative. ii. Student's Manual (1901), Experiment xxi, pp. 92f.




In the diagonal I have placed (in brackets) [...], not half that number (as Guilford does), because
no differences could occur between the judges about the same boy.


TABLE III. TEACHERS’ JUDGEMENTS EXPRESSED BY PAIRED COMPARISON

Number of times boy named at top of a column was ranked above boy named at left.

Boys     T.S.  B.R.  O.F.  R.L.  E.B.  M.W.  G.A.  J.H.   Total
T.S.     (4)    3     0     0     1     0     0     0       8
B.R.      1    (4)    0     0     1     0     1     0       7
O.F.      4     4    (4)    2     3     2     1     2      22
R.L.      4     4     2    (4)    2     2     1     2      21
E.B.      3     3     1     2    (4)    3     2     2      20
M.W.      4     4     2     2     1    (4)    3     3      23
G.A.      4     3     3     3     2     1    (4)    2      22
J.H.      4     4     2     2     2     1     2    (4)     21
Total    28    29    14    15    16    13    14    15     144
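
Every entry of Table III follows mechanically from the ranks of Table I, and the row totals necessarily reproduce the summed ranks. The Python sketch below rebuilds the table on that basis; it assumes, as the table does, that the diagonal cell is filled with the number of judges (4), and the helper name `times_ranked_above` is illustrative only.

```python
# Ranks awarded by Teachers i-iv to each boy, from Table I.
ranks = {
    "T.S.": (2, 2, 1, 3),
    "B.R.": (1, 1, 3, 2),
    "O.F.": (4, 6, 8, 4),
    "R.L.": (5, 4, 7, 5),
    "E.B.": (8, 5, 6, 1),
    "M.W.": (7, 3, 5, 8),
    "G.A.": (6, 7, 2, 7),
    "J.H.": (3, 8, 4, 6),
}
boys = list(ranks)
m = 4  # number of judges


def times_ranked_above(top, left):
    """Number of teachers who placed `top` above `left` (smaller rank = better)."""
    return sum(1 for a, b in zip(ranks[top], ranks[left]) if a < b)


# table[left][top]: the cell in the row for `left` and the column for `top`.
table = {
    left: {top: (m if top == left else times_ranked_above(top, left)) for top in boys}
    for left in boys
}

row_totals = [sum(table[left].values()) for left in boys]
column_totals = [sum(table[left][top] for left in boys) for top in boys]

for left in boys:
    print(left, [table[left][top] for top in boys], sum(table[left].values()))
print("Total", column_totals, sum(column_totals))
```

Each row total is 4 plus the number of preferences given to other boys, i.e., the sum of (rank − 1) over the four teachers plus 4, which is simply the boy's summed rank.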


[...]ta of this kind a suitable method of statistical analysis has been suggested by Kendall
(op. cit., ch. XI).

[...] or the judge is unintentionally departing from
[...]. In that case it would be better to correct the inconsistent judge during the process
of judging (allowing ties in case of doubt) rather than to look for inconsistencies afterwards.

To measure the amount of agreement, Kendall proposes a coefficient which he denotes by
u. Slightly simplifying his working formula, we may write:

$$u = 1 - \frac{2\left(m\sum p - \sum p^2\right)}{\tfrac{1}{2}\,m(m-1) \times \tfrac{1}{2}\,n(n-1)}, \qquad \text{(xiv)}$$

where p denotes the frequencies entered [...] example given above: [...]

But (as Kendall points out) u is [...] correlation between two sets of ranks [...],
however, is always lower than the [...] of rank correlation (ρ). Hence, to those who are more familiar
[...] ordi[nary ...], the value computed from eqn. (xiv) may be somewhat
uninformative, if not actually misleading.

[...] particular boy, say J.H., the last in the list. According to Table I, Teacher i
[...]; similarly [...] three boys; and Teacher iv five boys. The total of such preferences
[...]; by adding 4, we shall obtain the total enter[ed ...].
Similarly for all the other [...].


It is clear, therefore, that we can use the totals obtained by paired comparison in the same way 
as we used the totals obtained from the ranks, to calculate the correlation ratio ; and from that 
we may deduce, as before, the average correlation. This method of treating the data has the 
advantage of keeping the constants so calculated in close relation to the results obtained from 
ordinary correlational techniques and in particular from a factorial analysis. Hence, except when 
the method of paired comparison reveals inconsistencies, there would seem to be no advantage in 


using the coefficient u.

CYRIL BURT.


THE IDENTIFICATION AND INVARIANCE OF FACTORS 
By T. LEYDEN 


In several papers in this Journal (e.g., II, pp. 134f.) various methods have been recommended for
deciding how far a factor found in one research may be justifiably identified with that found in 
another research. The earliest suggestions appear to be those put forward by Burt, who attempted 
to show that the educational abilities discovered in his later surveys were the same as those reported 
in the earlier (cf., e.g., J. Educ. Psychol., YX, pp. 68f., and refs.). He suggested (a) the correlation between 
factor-measurements when the samples are the same, (b) the ‘unadjusted correlation’ between
factor-saturations when the samples are different, (c) a ‘symmetry criterion’ as an over-all test. More
recently Fiske (Amer. Psychol., III, pp. 360f.) and others have proposed (d) the ordinary (‘adjusted’)
correlation between saturations, which has proved extremely popular. Cattell has described (e) a
‘symmetrical marker criterion,’ and (f) a ‘coefficient of pattern similarity’ derived from the formula
for chi-squared (cf. this Journal, II, pp. 136-9, and refs.). In an investigation on which I am engaged
it has been surprising to find how divergent, and even contradictory, the verdicts pronounced by 
these various methods frequently are, especially in the case of the first factor. Many educationists 
besides myself would, I know, be grateful for a fuller examination of this problem. 

To check my own observations, I secured nine sets of test-data, each relating to the same group 
of tests (or nearly so) applied to the same group of pupils at intervals of one to three years. In three 
the tests were mixed batteries of intelligence-tests ; in six they were educational. Since the tests were 
not carried out with this particular problem in view, the results deserve only the briefest report. I
hope to discuss the details more fully in a later contribution based on a more systematic enquiry
planned ad hoc.

To compare the results obtained I have taken the correlation between factor-measurements as 
giving the ideal figure, and have expressed the coefficient obtained from the other procedures as a 


percentage of this figure. 


TABLE I. COMPARISON OF CRITERIA

                                                                                        Average
Criterion                                                                              Percentage
1. Correlation between factor-measurements                                               100.0
2. Correlation of factor-measurements with external criterion (teachers' assessments)     87.3
3. Unadjusted correlation between saturations                                             91.6
4. Adjusted correlation between saturations                                               14.5
5. Marker criterion                                                                       52.1 [?]
6. Coefficient of pattern similarity                                                      61.4


The figures above relate only to the first or general factor as extracted by simple summation or the
centroid method. With the corresponding ‘basic’ factor, as extracted by a group factor analysis,
the proportions are much the same, except in three cases where there are large supplementary factors:
in such cases, not only the relative size and order of the saturations, but also the size of the
factor-variances, may appreciably change in the second research. In one case, for example, it appeared
that, after an interval of two years, the variance of the general factor obtained by the centroid method
had apparently increased from 51 to 54 per cent., whereas the variance of the basic factor had dropped
from 46 to 38 per cent. Thus the use of the centroid procedure instead of a group factor analysis
may at times obscure the well-known tendency towards increased specialization of abilities with
increasing years.

The main conclusion seems clear. With the general factor the popular method of correlating
the two sets of correlations may yield a most misleading conclusion: (with bipolar factors, of course,
the two coefficients are bound to be much the same). On the whole, the safest method seems clearly
to be that based on the ‘unadjusted correlation.’
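
Why the two coefficients may part company for a general factor can be seen from a small numerical illustration. The Python sketch below rests on two assumptions that go beyond the note itself: that the ‘unadjusted’ correlation is the sum of cross-products divided by the root of the product of the sums of squares, with no deduction of the means, while the ‘adjusted’ coefficient is the ordinary product-moment correlation; and that the two sets of saturations shown are purely hypothetical figures, chosen to resemble a general factor with uniformly high loadings.

```python
from math import sqrt

# Hypothetical saturations of five tests on the first factor in two researches
# (invented for illustration; not data from the note).
first = [0.80, 0.75, 0.70, 0.72, 0.78]
second = [0.74, 0.79, 0.76, 0.70, 0.73]


def unadjusted(a, b):
    """Cross-products taken about zero: sum(ab) / sqrt(sum(a^2) * sum(b^2))."""
    return sum(x * y for x, y in zip(a, b)) / sqrt(
        sum(x * x for x in a) * sum(y * y for y in b))


def adjusted(a, b):
    """Ordinary product-moment correlation: deviations taken about the means."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return unadjusted([x - mean_a for x in a], [y - mean_b for y in b])


print(round(unadjusted(first, second), 3))
print(round(adjusted(first, second), 3))
```

When every saturation is large and of the same sign, the unadjusted figure is dominated by the common level of the loadings, whereas the adjusted figure depends only on the small fluctuations about that level, and so may be low or even negative although the factor is plainly the same.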


z 245 ire 
Vol. u The British Journal of Statistical Psychology Novem 
Part 


BOOK REVIEWS 


Factor Analysis. By Raymond B. Cattell. New York: Harper & Bros. Pp. xiv + 462. £2 8s.

[...] Cattell’s latest book has a threefold purpose, and is divided into three
[...]. Part I seeks “to meet the need of the general student in science [...]
is intended to enable the volume to be used as “a text-book for statistics courses.” Par[t ...]
more advanced; and, with this further addition, the author hopes that the volume may also serve
as a “handbook for the research worker.”

[...] might be reached. Further, “a clear
[...] Maxwell Garnett in 1919.” Here, as Dr. Cattell
incidentally observes, it is “an instructive commen[...]” that Garnett’s
[... ma]de his discovery.” It [...]
not “generally recognized” even now; and even
Dr. Cattell seems to have forgotten that Dr. Garnett was Karl Pearson’s assistant and explicitly
attributes the essential principles to him. Of the various techniques available, we are told, “probably
[...] suggested for the purpose, and was
[...] paper published by Karl Pearson as [...]

[...] with several hypo[...]
[...] research. Professor Cattell therefore lays special stress at the
outset on the need for the psychologic[al ...] scientific method [...]
[...] bestowed [...] on the so-called “hypothetico-deductive procedure.”
The logical methods by which science seeks to establish empirical relations, he says, are
[...] Bacon and Mill.” Indeed, in his own view, factor
[...] the necessity to elaborate rigid hypotheses: it
[...] (his italics), though “of course” [...]
the factorist always enters with some h[...] (as he says elsewhere)
even when he seems to enter with none.”

It is partly the desire for freer exploration and partly the belief that the
[...] of Pearson do not yield the ‘most meaningful factor[...]

In Part II, curiously enough, the ‘inventor’ o[f ...]
such claim and whose version has [...]
[...] Brown, plainly had Pearson's technique
[...] also have endorsed Cattell's critici[sm ...]. The fact that Pearson's
procedure has not proved practicable or appropriate in psychology without considerable modification
[...] that it was after all Pearson who first saw the problem and produced
the first multifactor technique.




than a fraction of the total personality, he himself is convinced that the most meaningful scheme 
will be a ‘simple structure’ of group factors: such a structure, he considers, will also show most 
invariance. 

How far a factor is invariant must depend, as he rightly points out, on many more conditions 
than are commonly assumed. Hitherto invariance has usually been interpreted to mean stability of
factors, when tests are varied. But the psychologist’s initial measurements, like the physicist's,
imply variation in three distinct ‘dimensions.’ As Stern pointed out long ago, they consist of
assessments for certain attributes obtained from certain individuals at certain times. Hence it would seem
possible, a priori, to correlate and factorize either attributes, or times, or individuals. And in some 
of the earliest British investigations we find profiles derived to describe individual children, and 
factorizations of test-results obtained on successive days of the week or at different stages of the 
child's growth. Spearman and Thurstone, on the other hand, seldom mention either the individuals 
as such or the changes they exhibit with the passage of time ; for the most part they and their 
followers treat factor analysis as concerned primarily with attributes, and chiefly with cognitive 
attributes. Professor Cattell agrees with those who adopt a wider standpoint. He ingeniously elabor- 
ates the box form of Stern's three-dimensional chart, and uses it to provide a somewhat elaborate 
symbolism to distinguish the various conceivable approaches. Thus “R1B-technique is the typical
procedure; Q2B-technique is Burt’s advocated use (of correlating persons); 015 is Stephenson’s;
and P1B the analysis of the single individual.” As Cattell remarks, what might seem at first a complex
and confusing situation was greatly simplified by Burt's demonstration of the ‘reciprocity principle.’
Since in theory the matrix factorized by one method is simply the * transpose ' of the matrix factorized 
by the other, the factors obtained by one technique must in theory be the same (except for those 
eliminated by preliminary averaging) as the factors obtained by the other: e.g., to take Cattell's 
illustration, “ if correlation of abilities yields a factor of mechanical aptitude, then correlation of 
persons will yield a factor or type consisting of mechanically apt persons,” and so on. The theorem
is really a special instance of Poncelet's ‘duality principle,’ which has proved so fruitful in projective
geometry. This conclusion, although at first vigorously attacked, is, so Cattell considers, fully con- 
firmed by the recent work of Madow. It would seem to follow, as he says, that the views still held 
by Stephenson about the independence of Q- and R-technique cannot be accepted. In practice of 
course there will be many minor differences arising from the different modes of approach ; but 
Cattell’s final conclusion is that “R-technique provides the binding frame of reference.”
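
The algebraic core of the reciprocity argument can be illustrated numerically. The Python sketch below is offered under plainly stated assumptions: the score matrix is hypothetical; raw cross-products are used in place of correlations, so that the preliminary averaging mentioned above (which removes a factor from one analysis or the other) does not enter; and the largest latent root is found by the simple power method. It shows that the matrix of products between tests and the matrix of products between persons share the same largest latent root, and that the latent vector for persons is recoverable from that for tests.

```python
from math import sqrt

# Hypothetical scores (invented for illustration): rows are persons, columns are tests.
scores = [
    [3.0, 2.0, 1.0],
    [2.0, 2.0, 1.0],
    [1.0, 1.0, 2.0],
    [0.0, 1.0, 3.0],
]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def dominant(sym, iterations=500):
    """Largest latent root and unit latent vector of a symmetric matrix (power method)."""
    v = [1.0] * len(sym)
    for _ in range(iterations):
        w = [sum(a * x for a, x in zip(row, v)) for row in sym]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    root = sum(x * sum(a * y for a, y in zip(row, v)) for x, row in zip(v, sym))
    return root, v


between_tests = matmul(transpose(scores), scores)      # tests x tests
between_persons = matmul(scores, transpose(scores))    # persons x persons

root_t, vec_t = dominant(between_tests)
root_p, vec_p = dominant(between_persons)

# Person vector implied by the analysis of tests: scores * vec_t / sqrt(root).
implied_p = [sum(x * v for x, v in zip(row, vec_t)) / sqrt(root_t) for row in scores]

print(round(root_t, 6), round(root_p, 6))
print([round(x, 6) for x in vec_p])
print([round(x, 6) for x in implied_p])
```

Once each matrix is separately centred or standardized (by tests in the one case, by persons in the other), the identity holds only approximately, which is the source of the minor differences in practice to which the review refers.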

Part II deals with ‘Specific Aims and Working Methods.’ Professor Cattell begins by classifying
the various working procedures available. “Of the possible classifications according to the algebraic
process, the division into the methods of simple summation and weighted summation is one which is
fundamental for mathematicians and also important for the computer.” But there is a further
distinction, no less essential to an understanding of the conclusions reached, between what he prefers
to call ‘bipolar solutions’ (i.e., summational and ‘centroid’ factors) and ‘bifactor (i.e., general
and group factor) solutions.’¹ The former, he agrees, might be regarded as “difference factors with
respect to the latter”; but he doubts whether this is always true. On the whole he considers that
“the most likely configuration is one of overlapping group factors with any combination of signs
that may be required,” in short, a simple multifactor structure which may have negative as well as
positive loadings. “The real objection,” he says, “to Burt's argument for ‘bifactor’ methods is
that the factors obtained are not the same in meaning as those obtained from simple structure centroid
analysis”: even if the ‘primary abilities’ reached by Thurstone (the verbal, spatial, numerical,
and other group factors) appear to be similar to those previously found by Burt, there is nevertheless
“one factor in the bifactor series, namely, the first general factor, which is never matched by anything
in the multifactor series” (his italics). But surely this argument will cut both ways. A critic might
equally well maintain that the oblique factors obtained in a ‘simple structure’ are “not the same
in meaning” as those obtained by an analysis into general and group factors, and that they omit


1 This use of terms is likely to be a little confusing to the beginner. The terms ‘bipolar,’ ‘bifactor,’
and ‘centroid’ are so closely associated with the methods of the writers who introduced them that
it would seem safer to allow each a kind of copyright for his own specific terms. On p. 133 Cattell
appears to suggest that the scheme of general-plus-group factors was a later modification of “the
process called bifactor analysis by Holzinger,” and was “conceived as a direct extension of Spearman’s
two factor solution.” Spearman, however, explicitly states that Burt's method was “chronologically
earlier than the bifactor method” (Human Ability, p. 28); and surely the phrase ‘bifactor
method’ is needed to describe the particular working method devised by Holzinger and his co-workers.
The resulting factor-matrices may appear the same in their general structure, but in their details the
factors and their specifications may differ appreciably. For example, one of Cattell’s objections to
the ‘bifactor solution’ is that the results are ‘not invariant’; this may be true of the results reached
by Holzinger's procedure for the reasons he gives; but it does not appear to be true of the scheme
of factors reached by the so-called ‘group factor method.’ Indeed, owing to the objectivity of the
latter procedure, the results are much more likely to be invariant than those of simple structure, owing
to the wide choice of “alternative sets” to which Cattell himself alludes.




the factor which, as a rule, accounts for quite as much of the variance as all the group factors put
together. It seems to have been a belated recognition of this point that led Thurstone to suggest
re-factorizing the correlated factors, and so securing a ‘second order’ matrix, containing a general
factor as well as group factors.


Cattell's own view of simple structure, it will be noticed, departs in one important respect from
that of Thurstone; he does not insist that it should contain positive saturations only. “The myth
that a positive manifold must be maintained,” he says, “we have already dismissed, since almost
any factor is as capable of interfering with a performance as of aiding it.” He mentions further
difficulties attending the search for simple structure. “The discovery that one and the same
correlation matrix may yield more than one simple structure solution, and therefore alternative sets
of factors, presents a disturbing problem.” To secure a decisive result “a common device among
the unsophisticated is to rotate for psychological meaning, i.e., to move a factor to a position where
the loadings agree with some preconception of the experimenter.” But, he continues, “almost any
preconception can be confirmed in this way, and this approach merely perpetuates erroneous
speculations.”


To escape from this situation various devices are considered. One obvious solution, as he
points out, is “not to rotate at all! The most systematic defence of this procedure has been made
by Burt, followed by Eysenck.” But is this a fair representation of these writers’ views? Burt, for
example, as readers of this Journal will be aware, has described both arithmetical and graphical
procedures for transforming a set of summational (or ‘bipolar’) factors to a corresponding set of
general and group factors; and, in order to avoid the subjective influences, like those of which
Professor Cattell complains, the rotations are based on the dichotomous classification indicated by
the bipolar factors. Eysenck, in the earliest of the papers cited by Cattell,¹ follows (as he says) an
alternative method “used by Burt and his co-workers nearly twenty years ago”; this is to “make graphs
of the [observed] coefficients and judge by the resemblances between the contours: tests that
yield parallel contours are then assigned to the same group factor.” It is true that there is here no
explicit rotation: but the purpose of the calculations was to show that, with certain types of correlational
table, what are virtually ‘rotated factors’ can be reached without “the prolonged and
admittedly precarious graphical procedure.” In his later work rotation has often been employed.

Elsewhere Cattell himself describes what would seem to be an extension of much the same principle:
the use of “parallel proportional profiles.” This he would regard as “the ideal [...]
present in a practicable state,” and moreover “for si[...]
but, he adds, “it is not at [...] simple structure it has
several drawbacks.” As a conceivable alternative, he disc[...] ‘criteri[...]
to call it, ‘criterion rotation’; and adduces “logi[...]
or sufficient method.” In particular, he contends [...] the so-called factor “correspond
exactly” to the criterion, it i[...]
instructive discussion of various ‘[...]
“bringing the rotation process to a satisfactory [...]
of simple structure [...]

All Professor Cattell's pages have a racy liveliness that [...]
in some [...] betray signs of haste. There are [...] obscurities, not only in the
phrasing, but also in the figures: e.g., the reference o[...]
reference on p. 315 should be [...]; the [...] on p. 249 should be [...]
on p. 41, not, as stated, on ‘page 46.’


1 ‘Critical Notice of Thurstone's Primary Ment[al] [...]
A closer study of this paper, whic[h] [...]
scarcely bears out Cattell's own co[...]
common factor-variance, nearly 60 [...]
only 40 per cent. is covered by the seven g[...] accounted fo[r] [...]




Inaccuracies and obscurities of this kind, however trivial, are unfortunate in those chapters
which provide the student with his first introduction to the working procedure. In Table 1 Professor
Cattell sets out the correlation table, which is to serve throughout the book as his ‘main example’;
and, since his treatment is fundamental to his general exposition, it is worth while examining it
in some detail. In ch. III he explains how the student is to extract the first factor by what he
calls the ‘centroid method,’ which, he says, was introduced by Thurstone. Yet both the formula
cited and the procedure actually followed are, in point of fact, those which had been used by British
investigators long before Thurstone introduced the ‘centroid method’; and, indeed, Thurstone’s
modifications do not seem to have been adopted. In particular, the method of assessing communalities¹
and the method of reflecting signs depart a good deal from those which are the distinctive
features of Thurstone's working procedure.
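
The review does not quote the formula under discussion. As a rough illustration only, the first-factor step of simple summation can be sketched as follows: each saturation is the column total divided by the square root of the grand total. The function names and the three-test table are invented for the sketch, and the diagonal is assumed to hold the communalities already.

```python
import math


def first_centroid_factor(r):
    """First-factor saturations by simple summation.

    r is a square, symmetric list of lists; its diagonal is assumed
    (for this sketch) to hold the estimated communalities already.
    """
    column_sums = [sum(column) for column in zip(*r)]
    grand_total = sum(column_sums)
    return [s / math.sqrt(grand_total) for s in column_sums]


def residuals(r, saturations):
    """Table left after removing the products of the saturations."""
    n = len(r)
    return [[r[i][j] - saturations[i] * saturations[j] for j in range(n)]
            for i in range(n)]


# Hypothetical one-factor table built from invented saturations.
invented = [0.8, 0.6, 0.5]
table = [[a * b for b in invented] for a in invented]

saturations = first_centroid_factor(table)
first_residuals = residuals(table, saturations)
```

With a table of exactly one factor the invented saturations are recovered and the residuals vanish; with real data the residual table is what the later factors are extracted from, after the reflection of signs that the review goes on to discuss.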

In the next chapter the student is told how to extract the remaining factors in order of significance
until no significant figures are left. But here something seems to have gone wrong either with the
account of the procedure or with the way it has been carried out. First, as a little calculation quickly
shows, the variance of the second factor is only 0.987, while that of the third is 1.148; secondly, in
this last factor the saturations for variables in the negative half of the preceding factor yield positive
and negative totals that are far from equal (+.81 and −.65); and finally if we try to reconstruct
the observed matrix, the fit is by no means satisfactory. These anomalies appear due to various causes.
(1) To begin with Professor Cattell has not noticed that in two places the communalities he has used
destroy the Gramian character of the correlation matrix, and hence are bound to lead to inaccurate
figures. (2) In dealing with what he calls “the exasperating business of reflecting,” his own selection
of variables for sign-reversal is by no means the best; and it is this that accounts for the low variance
of the second factor.² (3) He tells the student that, since the observed coefficients are given to no
more than two decimal places, “all the figures in the computation” should be rounded off to two
places: but, if only two figures are retained in the computation, the final saturations are likely to
show rather large errors in the second place. (4) The method of allocating the signs to the saturations
for factors I and II in Table 8 gives negative values to those tests which (as even a casual inspection of
the original correlation table shows) must have positive values in the group factor matrix. It is,
therefore, hardly surprising that Professor Cattell is induced to conclude that his centroid factors
are devoid of meaning. But of course he might retort that at this stage he is not concerned with
meanings, but only with obtaining something to rotate.
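
Point (1) can be checked mechanically. A table is Gramian when it is symmetric and positive semi-definite, that is, when it could have arisen as the products of real saturations. The sketch below tests this by successive elimination; the function name, the tolerance, and both small tables are invented for illustration and are not Cattell's figures.

```python
def is_gramian(matrix, tol=1e-9):
    """Return True if a symmetric matrix is positive semi-definite.

    Works by successive elimination: every pivot must be non-negative,
    and a zero pivot must have nothing but zeros beneath it.
    """
    n = len(matrix)
    a = [row[:] for row in matrix]  # working copy
    for k in range(n):
        pivot = a[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            if any(abs(a[i][k]) > tol for i in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


# Hypothetical table generated by one factor (saturations .8, .6, .5).
consistent = [[0.64, 0.48, 0.40],
              [0.48, 0.36, 0.30],
              [0.40, 0.30, 0.25]]

# The same table with the first communality guessed too low.
inconsistent = [[0.30, 0.48, 0.40],
                [0.48, 0.36, 0.30],
                [0.40, 0.30, 0.25]]

consistent_ok = is_gramian(consistent)
inconsistent_ok = is_gramian(inconsistent)
```

In the second table the correlation .48 exceeds the square root of the product of the two communalities (.30 × .36), which no pair of real saturations could produce.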

In ch. V the student is told ‘how to rotate for scientific meaning.’ Cattell's Table 9 gives the
“orthogonally rotated factor-matrix” reproduced below, and appends new communalities “obtained
from the actual loadings by squaring”; (incidentally, the squared loadings for test 4 do not add up
to the communality stated, namely, 0.98). The reader is then assured that “further rotation fails to
improve on the above,” and that “the best simple structure has been obtained.” Yet on a later
page (Table 27) some of the values are improved, without any specific indication of the change. But
even so, when the captious reader tries to reconstruct the initial correlations he quickly discovers that
there is still room for a good deal of further improvement.
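
The check used here rests on a simple property, sketched below with invented loadings: the correlations reproduced from a factor-matrix are the sums of products of its rows, the diagonal entries being the communalities, and an orthogonal rotation leaves all of them unchanged. Rotation by itself therefore cannot improve or worsen the fit to the observed table. The function names and figures are hypothetical.

```python
import math


def reproduced_correlations(f):
    """Sums of products of rows; the diagonal holds the communalities."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in f]
            for row_i in f]


def rotate_pair(f, p, q, degrees):
    """Rotate factors p and q orthogonally through the given angle."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    rotated = []
    for row in f:
        new_row = list(row)
        new_row[p] = c * row[p] + s * row[q]
        new_row[q] = -s * row[p] + c * row[q]
        rotated.append(new_row)
    return rotated


# Hypothetical loadings for four tests on three factors.
loadings = [[0.6, 0.5, 0.1],
            [0.7, -0.4, 0.2],
            [0.5, 0.3, -0.4],
            [0.6, -0.3, -0.2]]

rotated = rotate_pair(loadings, 0, 1, 35.0)
before = reproduced_correlations(loadings)
after = reproduced_correlations(rotated)
largest_change = max(abs(x - y)
                     for row_b, row_a in zip(before, after)
                     for x, y in zip(row_b, row_a))
```

Any discrepancy between a stated communality and the sum of the squared loadings in the same row, such as the one noted for test 4, must therefore come from the figures themselves and not from the rotation.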

To demonstrate these points, let us examine the results obtained from a straightforward application
of the method of simple summation. Let us adopt the method of ‘simple summation’ in its
original form³ and then apply an arithmetical rather than a graphical rotation. One iteration only


1 I have found more than one of my colleagues baffled by the attempt to discover how the proposed
communalities were reached. In the note to Table 1 (p. 41) the student is told that “the method by
which they are ‘guessed’ will be explained later”; but nothing more is said about the matter in this
chapter. When he reaches ch. X, he is told that “a rough method, No. 1 below, has already
been given in chapter III” (‘No. 1 below’ consists in taking the highest coefficient in the column,
but rounding it off to one significant figure only). But this method has not been given in ch. III;
and it is plainly not the method that has been used in that chapter: (in cols. 4 and 5 “the highest
coefficient in the column” is .66, but the figure inserted is .90; in cols. 7 and 8 “the highest coefficient”
is .72, but the figure inserted is .80 in col. 7 and .60 in col. 8). A later page suggests rearranging
the variables “in order of their mean r’s and considering the general trend instead of just blindly
inserting the highest.” “These considerations” (it is said) “lead to the practice suggested by Burt,”
and usually mean “estimating communalities with high mean r’s to be higher than their highest r
and those of low mean r’s to be a little lower.” This method is called ‘No. 2,’ and would seem to
be the method actually adopted in ch. III; but it is certainly not “that introduced by Thurstone.”

2 Even without rearranging the figures, an inspection of the table of ‘first residuals’ seems sufficient
to indicate the kind of sign pattern required: by far the largest figures are those in the column for
test 5, and here the positive and negative signs plainly contrast tests 2, 4, and 5 with tests 1, 3, 6, 7,
and 8. Having decided (presumably from the positive block at the corner) to group 6, 7, and 8
together, it is extremely difficult to see why Dr. Cattell then groups 5 with these.

3 Cf. Burt, C., Factors of the Mind, pp. 461f., or earlier papers. The working was carried to four decimal
places.




was needed to reach satisfactory communalities; and with these it appeared that all the correlations
except two could be completely accounted for by three factors only: the last two required
a small supplementary factor. The saturations are shown in Table I A. Using a strict [...]
rotation (based on the procedure described in this Journal, III, p. 75) we then obtain the rotated
factor-matrix shown in Table I B. These factor-saturations enable us to reproduce the correlations exactly.
I do not know what may be the source from which Professor Cattell derived his correlation table;
but it is tempting to hazard a guess that these were probably the figures used to construct it. The
results of Professor Cattell’s own rotation are inserted in Table I C. It will be seen that as many as
seven of the saturations printed deviate by between .10 and .15 from what may be regarded as the
true values: they are deviations which, with a sample of moderate size, might well exceed twice the
standard error.


TABLE I. FACTOR-SATURATIONS

Method: A. Simple Summation; B. Group Factors; C. Rotated Centroid Factors (Cattell)
Test I п ІШ IV О ту; ші Fx Е; 
Vi | 53448 -122 486 000 0 —4 6 0 | —05 —-61 10 
Ve | +396 “341 "217 :000 | 9 4 4 :0 46 —:37 —:04 
Vs 110 —-384 -401 000 | -© —4 4 :0 58 —47 13 
У; -932 337 167 -000 4 6 7 | 75 E57. 38 
V; +528 "688 --384 000 58 9 0 0 85 08 17 
Ve | :4033 -133 —:036 :000 ale ail 0 EU abl “01 14 
У, (682 —:385 —.455 100 ED 0 1 d 12 02 89 
У, | "613 --3422 —:396 —-100 8 0 1 —4 10 :04 80 
Sum of Squares | 2.276 1.148 .986 .020 | 1.71 1.51 1.19 .02 | 1.68 1.06 1.65


In case the reader cares to check the calculations, Table II shows the rotation matrix obtained
arithmetically. Since Cattell's bipolar-matrix is based on somewhat unsatisfactory estimates of the
common factor-variances for the several tests, it is not possible to obtain a very good approximation
to the ‘true’ values (particularly if an orthogonal transformation is used): I give a transformation
based on a least squares fit, and append Cattell’s rotation matrix for comparison.
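
The working behind the least squares fit is not given. One common way to obtain such a transformation, offered here only as a sketch, is to solve the normal equations (A′A)T = A′B, where A is the bipolar matrix and B the target matrix of group-factor saturations. The helper names and the small matrices below are invented, and the resulting T is not constrained to be orthogonal.

```python
def transpose(x):
    return [list(column) for column in zip(*x)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*y)]
            for row in x]


def solve(a, b):
    """Solve a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        best = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[best] = m[best], m[col]
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def least_squares_transformation(a, b):
    """Matrix T minimising the sum of squared differences between a.T and b."""
    at = transpose(a)
    return solve(matmul(at, a), matmul(at, b))


# Hypothetical bipolar matrix (4 tests, 2 factors) and a known transformation.
bipolar = [[0.7, 0.4], [0.6, 0.3], [0.6, -0.3], [0.5, -0.5]]
known_t = [[0.8, 0.6], [0.6, -0.8]]
target = matmul(bipolar, known_t)

fitted_t = least_squares_transformation(bipolar, target)
```

Here the target was generated from a known transformation, so the fit recovers it; with a target that the bipolar matrix cannot reproduce, as the paragraph above says of Cattell's loadings, the same routine returns only the closest approximation.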


TABLE II. TRANSFORMATION MATRICES


Arithmetical Graphical 


| 
А. Based on Loadings in Table I | B. Based on Cattell's Loadings | 
in TI^ ІШ” IV4 | I Il’ ІШ” 


2 ” 
Е, Е, Е,” 


1 6873 23574 “6303 :0000 Е 77003 -3858 “6202 Е WM 453 — 
E a Rs :0000 E "6165 +1236 —.7643 F, :50 19 86 
E = г 7 Е. i; EN d Pg 2 се E 
s WERDE 0000 3 Epa ND 1515 Е, 50 84 19 


It will be agreed, I think, that the bi[...]
contrary, it clearly indicates the [...]
suggests a group factor with positive saturations in tests 1 [...]
and finally factor [...] (found in factor III); the supplementa[ry] [...]
appreciable overlap [...]




Some psychologists will be inclined to ask whether these peculiarities in working methods may
not be responsible for some of the rather unexpected factorizations that Dr. Cattell has at times
obtained for his own empirical data (cf., for example, this Journal, II, pp. 114-30, and Dr. Banks’
criticisms and re-analysis in II, pp. 204-18); and possibly a good many will feel that the artificial
table thus chosen as the main illustration is scarcely typical of those which the majority of students
will encounter in their work. Moreover, if the student possesses some knowledge of mathematics
already, he would himself probably wonder why the elaborate calculations, described stage by stage
in ch. III, IV, V, X, and XII, are needed for this table. If, in addition, he has already gained
experience in group factor procedures, he would omit both the preliminary centroid analysis and
the prolonged rotations, and obtain the final figures (those of Table II B) at the very first step.

At the same time it would be unwise to press these criticisms too strongly. Hitherto most 
text-books on factor analysis have been chiefly concerned with the factorization of human abilities ; 
Dr. Cattell's book is welcome because he has adopted a much broader standpoint and has throughout 
kept well in mind the factorization of human personality generally. It is therefore conceivable that, 
in a fuller and more technical discussion, he might be able to adduce instructive arguments in favour 
of what at first sight seem rather inappropriate procedures. 

In any case, whether the reader accepts these methods and whether he agrees with the doctrines 
expounded or not, it is impossible to deny that the book itself provides one of the most stimulating 
introductions to the subject, and is full of novel, original, and often highly illuminating suggestions. 


E. N. Harper 


Tests Collectifs du Centre de Recherches. By M. Reuchlin and E. Valin. Paris: Institut National
d'Étude du Travail et d'Orientation Professionnelle. 1953. Pp. 152. 400 fr.


This monograph presents a study of the battery of tests compiled by the Research Centre of the
Institut National d'Orientation Professionnelle. The first part summarizes the main practical conclusions;
the second and longer part deals with the more theoretical questions of methodology. Here
the authors have “devoted an important place to certain procedures which seem of special interest
for such work, but have nevertheless been but little used in France,” namely, the procedures usually
called ‘factor analysis,’ “to which,” they say, “our analysis appears to make some contribution.”

“It was," they explain, “ between 1910 and 1920 that Burt found that correlations between 
cognitive tests could most frequently be accounted for by two types of factor—a general factor 
common to all and a series of group factors each common only to a limited number : this hypothesis 
was opposed by Spearman, who at first refused to admit the existence of group factors ; but it is 
equally opposed to the analysis into ‘simple structure’ developed later on by Thurstone.” However,
M. Reuchlin and M. Valin take the view that, in the latter case at any rate, the apparent conflict 
is by no means irreconcilable ; and they consider that their own figures clearly demonstrate that 
“the results obtained by Thurstone’s method (provided it is pushed to its final conclusion) lead to 
those obtained more simply by the method of analysis already described by Burt." It would be 
instructive to hear how far Thurstone and his followers accepted this conclusion. 

For the purposes of their main enquiry a battery of 15 tests was administered to nearly 2,000 boys 
and girls aged approximately 14 years. The children were divided into four subgroups—town boys, 
country boys, town girls, and country girls. A table of correlations was calculated for each group 
and then factorized by various procedures. Twenty-six pages of tables give their results in some 
detail. 
The writers begin with an ordinary ‘bipolar analysis’ based on the principle of ‘simple summation.’
This leads them first of all to discuss the similarities and differences between Burt's ‘simple
summation method’ and Thurstone's ‘centroid method.’ The most important divergence, in their
view, arises from the way the reduced self-correlations (or ‘communalities’) are assessed: Burt's
method is to estimate the self-correlations by inspecting the general trend of the observed correlations
(not relying solely on the largest coefficients), and then to correct these provisional estimates by the
results obtained from an actual analysis, repeating the whole process until stable figures are reached.
Thurstone, on the other hand, simply inserts the largest figure from each column, both when extracting
the first factor and when dealing with the later factors. The French investigators consider that the
procedure adopted by British workers “is much more satisfying in principle, but obviously more
exacting to apply.” With the present data they find about four iterations usually sufficient; and
their final calculations, they believe, warrant the extraction of four factors.
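
The contrast between the two ways of assessing communalities can be made concrete with a small sketch. It is simplified to a single factor, starts from the largest coefficient in each column instead of an inspection of the general trend, and uses an invented table; the function name, the tolerance, and the stopping rule are all assumptions of the sketch and not the authors' working.

```python
def iterate_communalities(table, start, tolerance=1e-8, max_rounds=500):
    """Re-estimate communalities from a one-factor summation until stable.

    table: square symmetric list of lists (its diagonal is ignored);
    start: provisional communalities, one per test.
    Returns the final communalities and the number of rounds used.
    """
    n = len(table)
    h = list(start)
    for rounds in range(1, max_rounds + 1):
        column_sums = [h[j] + sum(table[i][j] for i in range(n) if i != j)
                       for j in range(n)]
        grand_total = sum(column_sums)
        new_h = [s * s / grand_total for s in column_sums]  # squared saturations
        change = max(abs(a - b) for a, b in zip(new_h, h))
        h = new_h
        if change < tolerance:
            return h, rounds
    return h, max_rounds


# Hypothetical one-factor table from invented saturations.
invented = [0.8, 0.6, 0.5, 0.4]
table = [[a * b for b in invented] for a in invented]

# Starting guess: the largest off-diagonal coefficient in each column.
largest = [max(table[i][j] for i in range(4) if i != j) for j in range(4)]

final, rounds_used = iterate_communalities(table, largest)
```

For this invented table the repeated correction settles on the squares of the invented saturations, whereas the starting figures (.48, .48, .40, .32) are wide of them. The tolerance here is far finer than two-decimal work needs, so the number of rounds is not comparable with the four iterations reported by the French investigators.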

The bipolar factor-matrices reached by this procedure they regard as somewhat difficult to
interpret, because “ the negative saturations cannot easily be justified psychologically: in the 
cognitive field it is not easy to imagine two groups of tasks of such a nature that success in one group 
necessarily leads to relative failure in the other.” This argument, however, assumes that the factors 
obtained are to be identified, just as they stand, with cognitive abilities. But according to Burt— 
and the view has earned wide acceptance—factors are primarily principles of classification, and 
bipolar factors are primarily principles of dichotomous classification. On this basis the results can 




at once be interpreted. In three of their groups, the first bipolar factor evidently contrasts tests
requiring abstract or intellectual processes with those requiring more concrete or practical processes;
in the fourth group the same division is effected by the second bipolar factor. In each case the other
bipolar factor contrasts contents, namely, verbal with numerical and spatial with logical.


To secure an interpretation in terms of positive abilities, the authors next proceed to ‘rotate’
the initial factor-saturations, first in accordance with the methods described by Thurstone and
secondly in accordance with those originally proposed by Burt.


In Thurstone's procedure, the first stage is to obtain a ‘bipolar-matrix’ of ‘centroid factors.’
The second stage consists in applying a succession of graphical rotations until some kind of ‘simple
structure’ is achieved, which will consist of group factors only with no general factor. The investigators
offer very convincing evidence for regarding the results so reached as inadequate if not actually
misleading. It is true that, by permitting ‘oblique’ or correlated factors, Thurstone usually succeeds
in avoiding anything like a general factor; but this, it is argued, leaves readers quite in the dark
about the extent to which the various tests chosen for the battery really tend to measure the same
thing: in short “they are kept in the same state of ignorance, and therefore exposed to the same
kinds of error, as in the days when factor analysis had not been developed, and test-results were
[...] by mere inspection.” Thurstone, we are told, in [...] to simplify the [...]
scheme of general and group factors (i.e., of ‘general ability’ and ‘special aptitudes’), attempts to
throw the entire onus of explanation on group factors or special aptitudes only, just as Spearman
at one time hoped to explain everything by appealing to a general factor only, i.e., the factor of
‘general intelligence.’ But, it is argued, since the general or basic factor contributes quite as much
to the total variance as the group factors do, the new type of ‘parsimony’ seems just as indefensible
as the old. Why, simply on the grounds of personal preconceptions, should we ignore the useful
general factor any more than we ignore the narrower group factors? The conclusion therefore which
the authors draw is that “whenever oblique factors are used for the first order analysis, a second
order analysis is absolutely indispensable.”
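
What a second order analysis adds can be illustrated with a small sketch. It follows the later Schmid-Leiman style of orthogonalisation, which is not the working described in the monograph, and every figure in it is invented. The tests' saturations on the oblique group factors and the group factors' saturations on a single second-order general factor are taken as given.

```python
import math


def transpose(x):
    return [list(column) for column in zip(*x)]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, column)) for column in zip(*y)]
            for row in x]


def factor_correlations(general):
    """Correlations among oblique factors implied by one general factor."""
    k = len(general)
    return [[1.0 if i == j else general[i] * general[j] for j in range(k)]
            for i in range(k)]


def orthogonalise(pattern, general):
    """General saturations first, then residual (orthogonal) group saturations."""
    result = []
    for row in pattern:
        g = sum(p * s for p, s in zip(row, general))
        groups = [p * math.sqrt(1.0 - s * s) for p, s in zip(row, general)]
        result.append([g] + groups)
    return result


# Hypothetical pattern: six tests on two oblique group factors.
pattern = [[0.7, 0.0], [0.6, 0.0], [0.5, 0.0],
           [0.0, 0.8], [0.0, 0.6], [0.0, 0.5]]
general = [0.8, 0.6]  # invented second-order saturations

phi = factor_correlations(general)
oblique_fit = matmul(matmul(pattern, phi), transpose(pattern))
hierarchical = orthogonalise(pattern, general)
orthogonal_fit = matmul(hierarchical, transpose(hierarchical))
```

Both fits reproduce the same table, but only the second shows how much each test owes to the general factor: the first test, for instance, has a general saturation of .56 alongside a group saturation of .42.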


At this third stage, as they very justly assert, it is not sufficient to calculate just one general factor
of second order: the saturations for all the second order factors in all the tests should be computed.
Unfortunately “no American author using second order factors has, to our knowledge, pushed the
analysis as far as this, although the necessary technique permitting this has been described by
[...].” [...] be necessary to begin by estimating the diagonal values
or ‘self-correlations’; for this purpose, they say, “we first estimated [...]
by ‘smoothing,’ following the method described in connexion with the analysis by ‘summation’ [...]
successive [...].” The final outcome of the Thurstone analysis
[...] supplementary [...] in the case of the town boys [...], besides the first or
[...] town boys and town girls, four with the country [...]

In the last case the additional factor really forms a second
[...] factor; its appearance is attributed to the fact that (as not infrequently happens with
Thurstone's procedure) the correlations between the oblique factors would lead to [...]
over 1.0, if a single general factor alone were used. In the main the supplementary group factors
thus obtained are very similar in nature to those which appear in the ‘simple structures’: but they
have the advantage of being orthogonal instead of oblique or correlated. The inter-correlations
both of the tests and of the first order factors are now expressed, and better expressed, by a general
[...] that “analyses into first order oblique factors, which dispense
[...] factor, cannot of themselves provide a satisfactory foundation either for theory or for
[...]

The writers turn next to what they call the ‘méthode de Burt.’ In th[...]
procedure outlined in the article on ‘Group Factor Analysis’ in this Journal [...]
agree that this procedure is far less subjective than that of Thurstone, but contend that the subjective
element is still not wholly eliminated. British factorists commonly assume that, if a sum[...]
[...] is carried out first and the resulting sign-pattern adopted as a guide (to be
[...] by the results of successive approximation), then the final subdivision of the
[...] groups and subgroups emerges automatically. Drs. Reuchlin and Valin feel that,
owing to the lack of any test of statistical significance, there may still be room for arbitrary decisions,
when the [...] saturations are extremely small. Moreover, in calculating the preliminary matrix
of non-overlapping group factors, they are a little dismayed to find that somewhat different
cross-classifications appear with the different groups of children; for instance, two of the tests involving
[...] are classed sometimes with the tests requiring numerical ability and sometimes with the
[...]. But, since each test involves both processes, and since boys and girls (or urban
and rural pupils) may conceivably make use of either process in varying degrees, it is not surprising
to find such discrepancies; indeed, so far from constituting defects of the method, they probably
indicate [...] the way the pupils tend to attack such problems.
[...] any doubt, however, the full factorial procedure should, if possible, be followed
and then every decision will be determined by the data. If, for example, during the initial stages




a low negative saturation leads to an incorrect grouping, this will be mechanically corrected when the
non-overlapping factors are replaced by overlapping factors. No doubt in practice these elaborate
precautions are seldom employed. Unless the numbers tested are extremely large, the minor changes
that would result could, as a rule, claim only moderate probability, and consequently it is legitimate
to shorten the procedure, as indeed the French investigators have done. Where many
iterations are required, they point out that the procedure can be greatly facilitated, “by proceeding,
as Burt often does, by condensation,” or by graphical devices. But probably most psychologists
would hold that, whenever the doubtful points are of importance, the wiser course would be,
not to spend time on more refined calculations, but to plan a new and more suitable experiment
to solve the issues so raised.

At the later stages, where rotations are required to show the extent of overlapping, the French
writers prefer (with most British investigators) to rely mainly on an arithmetical method of rotation.
For the final figures, however, instead of continuing the iterative procedure when only slight modifications
are likely to result, they have “effected a few slight orthogonal rotations with the graphical
procedure.” In doing so, they are inclined to wonder whether their procedure may not be “a clumsy
application,” departing a little too far from the rigorous ideal. But, as Burt and others
have observed, provided the rotations are carried out ‘blindly’ and in accordance with perfectly
objective principles, graphical procedures may be used (by those who prefer them) even when the
computer wishes to obtain a matrix of general and [...] group factors which are perfectly
orthogonal. Though seldom so accurate, they are occasionally speedier than arithmetical iterations.


The results at which the investigators finally arrive are highly instructive. All the factor-matrices
clearly reveal the tendency towards the recurrent pattern of “g, v, [...]” which so often appears
in British work. In the tables obtained from the country boys, there is first a general or basic factor
accounting for nearly 70 per cent. of the common factor-variance (rather over 40 per cent. of the
total variance) and four group factors: verbal, numerical, spatial, and (the smallest of all) reasoning.
Moreover, in these instances there is practically no overlapping. In two groups, however, the reasoning
factor shows a tendency to fuse with the numerical and the spatial factors. In the results obtained
from the country boys and from the country girls, there is a large group factor covering the ‘intellectual’
tests (as distinct from the ‘practical’), and this then tends to subdivide into separate verbal
and numerical factors which do not overlap.
But the finding on which the writers lay greatest stress is the remarkable similarity in every
group between the factor-matrices obtained with Burt's relatively simple methods and those eventually
reached with Thurstone's second-order analysis when carried through all four stages [...].
The two methods therefore do not really lead to inconsistent results as is so
often supposed [...] is possible. Further, “it seems to us important and fair to
stress that [...] has not required Burt to modify his original conception [...]
his ‘opponents’ who [...] come round to his view [...]”

[...] The graphical method adopted is not quite the same as that used by the French authors.
[...] in such cases the graphical [...] based on non-overlapping group factors, the computer
[...] begins with a preliminary arithmetical rotation by a standard transclassificatory operator,
[...] ‘equilibrated orthogonal matrix’ [...] (for its nature and use, see this Journal,
I (1947), p. 8.)

It may be noted that, what would seem to be the earliest published example of factor-rotation,
[...] simple graphical procedure (Proc. Roy. Soc., A, XCVI, pp. 91f.). Burt also used
it in factorizing the first large American investigation on the ‘Correlation of Mental Abilities’
(Simpson, B. R., [...] Contributions to Education, 1912). Indeed, it was Spearman's subsequent
criticism of the group factors found in this table (Abilities of Man, 1925, pp. 145-6, 151)
[...] demonstrated the need for a more rigorous procedure for establishing such factors.


INDEX TO VOLUME VI


INDEX OF AUTHORS

Authors    Titles    Page

Burt, C.    Scale Analysis and Factor Analysis    5-24
Burt, C.    The Relative Merits of Ranks and Paired Comparisons    112-119
Dunsdon, M. I., and Roberts, J. A. Fraser    The Relation of the Terman-Merrill Vocabulary Test to Mental Age in a Sample of English Children    61-70
Ehrenberg, A. S. C.    On Making Statistical Assumptions    41-42
Eysenck, H. J.    The Scientific Study of Personality    44-46
Gourlay, N.    Covariance Analysis and its Applications in Psychological Research    25-34
Guttman, L.    A Note on Sir Cyril Burt's ‘Factorial Analysis of Qualitative Data’    1-4
Leyden, T.    The Identification and Invariance of Factors    119
Morris, R. E.    The Relative Merits of Ranks and Paired Comparisons    112
Richardson, L. F.    The Submissiveness of Nations    77-90
Richardson, L. F.    On the Obligation to Refute Rival Hypotheses    107
Roberts, J. A. Fraser, and Dunsdon, M. I.    The Relation of the Terman-Merrill Vocabulary Test to Mental Age in a Sample of English Children    61-70
Sear, Elizabeth H.    Changes in Terman-Merrill I.Q.s with Dull Children    71-76
Sen, Amya    A Preliminary Study of the Thematic Apperception Test    91-100
Slater, P.    The Factor Analysis of Negative Correlations    101-106
Whitfield, J. W.    The Distribution of Total Rank Value for One Particular Object in m Rankings of n Objects    35-40
Wrigley, J.    The Publication of Tables    52


INDEX OF SUBJECTS

Titles    Authors    Page

Covariance Analysis and its Applications in Psychological Research    Gourlay, N.    25-34
Factors, The Identification and Invariance of    Leyden, T.    119
Hypotheses, On the Obligation to Refute Rival    Richardson, L. F.    107
Nations, The Submissiveness of    Richardson, L. F.    77-90
Negative Correlations, The Factor Analysis of    Slater, P.    101-106
Personality, The Scientific Study of    Eysenck, H. J.    44-46
Qualitative Data, A Note on Sir Cyril Burt's ‘Factorial Analysis of’    Guttman, L.    1-4
Ranks and Paired Comparisons, The Relative Merits of    Morris, R. E.    112
Ranks and Paired Comparisons, The Relative Merits of    Burt, C.    112-119
Rank Value, The Distribution of Total, for One Particular Object    Whitfield, J. W.    35-40
Scale Analysis and Factor Analysis    Burt, C.    5-24
Statistical Assumptions, On Making    Ehrenberg, A. S. C.    41-42
Tables, The Publication of    Wrigley, J.    52
Terman-Merrill I.Q.s, Changes in, with Dull Children    Sear, Elizabeth H.    71-76
Terman-Merrill Vocabulary Test, Relation of, to Mental Age in a Sample of English Children    Dunsdon, M. I., and Roberts, J. A. Fraser    61-70
Thematic Apperception Test, A Preliminary Study of the    Sen, Amya    91-100


BOOKS REVIEWED

Authors    Titles    Page

Brower, D., and Abt, L. E.    Progress in Clinical Psychology    55-56
Burt, C.    Contributions of Psychology to Social Problems    57
Burt, C.    The Causes and Treatment of Backwardness    58
Cattell, R. B.    Factor Analysis    120-125
Ferguson, T.    The Young Delinquent in his Social Setting    56-57
Fisher, R. A.    Contributions to Mathematical Statistics    53-54
Reuchlin, M., and Valin, E.    Tests Collectifs du Centre de Recherches    125-127


